summaryrefslogtreecommitdiffstats
path: root/doc/changelog/v0.67.3.txt
diff options
context:
space:
mode:
Diffstat (limited to '')
-rw-r--r--doc/changelog/v0.67.3.txt700
1 files changed, 700 insertions, 0 deletions
diff --git a/doc/changelog/v0.67.3.txt b/doc/changelog/v0.67.3.txt
new file mode 100644
index 000000000..d6b1f2b27
--- /dev/null
+++ b/doc/changelog/v0.67.3.txt
@@ -0,0 +1,700 @@
+commit 408cd61584c72c0d97b774b3d8f95c6b1b06341a
+Author: Gary Lowell <gary.lowell@inktank.com>
+Date: Mon Sep 9 12:50:11 2013 -0700
+
+ v0.67.3
+
+commit 17a7342b3b935c06610c58ab92a9a1d086923d32
+Merge: b4252bf 10433bb
+Author: Sage Weil <sage@inktank.com>
+Date: Sat Sep 7 13:34:45 2013 -0700
+
+ Merge pull request #574 from dalgaaf/fix/da-dumpling-cherry-picks
+
+ init-radosgw*: fix status return value if radosgw isn't running
+
+ Reviewed-by: Sage Weil <sage@inktank.com>
+
+commit 10433bbe72dbf8eae8fae836e557a043610eb54e
+Author: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
+Date: Sat Sep 7 11:30:15 2013 +0200
+
+ init-radosgw*: fix status return value if radosgw isn't running
+
+ Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
+ (cherry picked from commit b5137baf651eaaa9f67e3864509e437f9d5c3d5a)
+
+commit b4252bff79150a95e9d075dd0b5e146ba9bf2ee5
+Author: Samuel Just <sam.just@inktank.com>
+Date: Thu Aug 22 11:19:37 2013 -0700
+
+ FileStore: add config option to disable the wbthrottle
+
+ Backport: dumpling
+ Signed-off-by: Samuel Just <sam.just@inktank.com>
+ Reviewed-by: Sage Weil <sage@inktank.com>
+ (cherry picked from commit 3528100a53724e7ae20766344e467bf762a34163)
+
+commit 699324e0910e5e07a1ac68df8cf1108e5671ec15
+Author: Samuel Just <sam.just@inktank.com>
+Date: Thu Aug 22 11:19:52 2013 -0700
+
+ WBThrottle: use fdatasync instead of fsync
+
+ Backport: dumpling
+ Signed-off-by: Samuel Just <sam.just@inktank.com>
+ Reviewed-by: Sage Weil <sage@inktank.com>
+ (cherry picked from commit d571825080f0bff1ed3666e95e19b78a738ecfe8)
+
+commit 074717b4b49ae1a55bc867e5c34d43c51edc84a5
+Author: Samuel Just <sam.just@inktank.com>
+Date: Thu Aug 29 15:08:58 2013 -0700
+
+ PGLog: initialize writeout_from in PGLog constructor
+
+ Fixes: 6151
+ Backport: dumpling
+ Signed-off-by: Samuel Just <sam.just@inktank.com>
+ Introduced: f808c205c503f7d32518c91619f249466f84c4cf
+ Reviewed-by: Sage Weil <sage@inktank.com>
+ (cherry picked from commit 42d65b0a7057696f4b8094f7c686d467c075a64d)
+
+commit c22d980cf42e580818dc9f526327518c0ddf8ff5
+Author: Samuel Just <sam.just@inktank.com>
+Date: Tue Aug 27 08:49:14 2013 -0700
+
+ PGLog: maintain writeout_from and trimmed
+
+ This way, we can avoid omap_rmkeyrange in the common append
+ and trim cases.
+
+ Fixes: #6040
+ Backport: Dumpling
+ Signed-off-by: Samuel Just <sam.just@inktank.com>
+ (cherry picked from commit f808c205c503f7d32518c91619f249466f84c4cf)
+
+commit 53c7ab4db00ec7034f5aa555231f9ee167f43201
+Author: Samuel Just <sam.just@inktank.com>
+Date: Tue Aug 27 07:27:26 2013 -0700
+
+ PGLog: don't maintain log_keys_debug if the config is disabled
+
+ Fixes: #6040
+ Backport: Dumpling
+ Signed-off-by: Samuel Just <sam.just@inktank.com>
+ (cherry picked from commit 1c0d75db1075a58d893d30494a5d7280cb308899)
+
+commit 40dc489351383c2e35b91c3d4e76b633309716df
+Author: Samuel Just <sam.just@inktank.com>
+Date: Mon Aug 26 23:19:45 2013 -0700
+
+ PGLog: move the log size check after the early return
+
+ There really are stl implementations (like the one on my ubuntu 12.04
+ machine) which have a list::size() which is linear in the size of the
+ list. That assert, therefore, is quite expensive!
+
+ Fixes: #6040
+ Backport: Dumpling
+ Signed-off-by: Samuel Just <sam.just@inktank.com>
+ (cherry picked from commit fe68b15a3d82349f8941f5b9f70fcbb5d4bc7f97)
+
+commit 4261eb5ec105b9c27605360910602dc367fd79f5
+Author: Sage Weil <sage@inktank.com>
+Date: Tue Aug 13 17:16:08 2013 -0700
+
+ rbd.cc: relicense as LGPL2
+
+ All past authors for rbd.cc have consented to relicensing from GPL to
+ LGPL2 via email:
+
+ ---
+
+ Date: Sat, 27 Jul 2013 01:59:36 +0200
+ From: Sylvain Munaut <s.munaut@whatever-company.com>
+ Subject: Re: Ceph rbd.cc GPL -> LGPL2 license change
+
+ I hereby consent to the relicensing of any contribution I made to the
+ aforementioned rbd.cc file from GPL to LGPL2.1.
+
+ (I hope that'll be impressive enough, I did my best :p)
+
+ btw, tnt@246tNt.com and s.munaut@whatever-company.com are both me.
+
+ Cheers,
+
+ Sylvain
+
+ ---
+
+ Date: Fri, 26 Jul 2013 17:00:48 -0700
+ From: Yehuda Sadeh <yehuda@inktank.com>
+ Subject: Re: Ceph rbd.cc GPL -> LGPL2 license change
+
+ I consent.
+
+ ---
+
+ Date: Fri, 26 Jul 2013 17:02:24 -0700
+ From: Josh Durgin <josh.durgin@inktank.com>
+ Subject: Re: Ceph rbd.cc GPL -> LGPL2 license change
+
+ I consent.
+
+ ---
+
+ Date: Fri, 26 Jul 2013 18:17:46 -0700
+ From: Stanislav Sedov <stas@freebsd.org>
+ Subject: Re: Ceph rbd.cc GPL -> LGPL2 license change
+
+ I consent.
+
+ Thanks for taking care of it!
+
+ ---
+
+ Date: Fri, 26 Jul 2013 18:24:15 -0700
+ From: Colin McCabe <cmccabe@alumni.cmu.edu>
+
+ I consent.
+
+ cheers,
+ Colin
+
+ ---
+
+ Date: Sat, 27 Jul 2013 07:08:12 +0200
+ From: Christian Brunner <christian@brunner-muc.de>
+ Subject: Re: Ceph rbd.cc GPL -> LGPL2 license change
+
+ I consent
+
+ Christian
+
+ ---
+
+ Date: Sat, 27 Jul 2013 12:17:34 +0300
+ From: Stratos Psomadakis <psomas@grnet.gr>
+ Subject: Re: Ceph rbd.cc GPL -> LGPL2 license change
+
+ Hi,
+
+ I consent with the GPL -> LGL2.1 re-licensing.
+
+ Thanks
+ Stratos
+
+ ---
+
+ Date: Sat, 27 Jul 2013 16:13:13 +0200
+ From: Wido den Hollander <wido@42on.com>
+ Subject: Re: Ceph rbd.cc GPL -> LGPL2 license change
+
+ I consent!
+
+ You have my permission to re-license the code I wrote for rbd.cc to LGPL2.1
+
+ ---
+
+ Date: Sun, 11 Aug 2013 10:40:32 +0200
+ From: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
+ Subject: Re: btw
+
+ Hi Sage,
+
+ I agree to switch the license of ceph_argparse.py and rbd.cc from GPL2
+ to LGPL2.
+
+ Regards
+
+ Danny Al-Gaaf
+
+ ---
+
+ Date: Tue, 13 Aug 2013 17:15:24 -0700
+ From: Dan Mick <dan.mick@inktank.com>
+ Subject: Re: Ceph rbd.cc GPL -> LGPL2 license change
+
+ I consent to relicense any contributed code that I wrote under LGPL2.1 license.
+
+ ---
+
+ ...and I consent too. Drop the exception from COPYING and debian/copyright
+ files.
+
+ Signed-off-by: Sage Weil <sage@inktank.com>
+ (cherry picked from commit 2206f55761c675b31078dea4e7dd66f2666d7d03)
+
+commit 211c5f13131e28b095a1f3b72426128f1db22218
+Author: Yehuda Sadeh <yehuda@inktank.com>
+Date: Fri Aug 23 15:39:20 2013 -0700
+
+ rgw: flush pending data when completing multipart part upload
+
+ Fixes: #6111
+ Backport: dumpling
+ When completing the part upload we need to flush any data that we
+ aggregated and didn't flush yet. With earlier code didn't have to deal
+ with it as for multipart upload we didn't have any pending data.
+ What we do now is we call the regular atomic data completion
+ function that takes care of it.
+
+ Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
+ Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
+ (cherry picked from commit 9a551296e0811f2b65972377b25bb28dbb42f575)
+
+commit 1a9651010aab51c9be2edeccd80e9bd11f5177ce
+Author: Yehuda Sadeh <yehuda@inktank.com>
+Date: Mon Aug 26 19:46:43 2013 -0700
+
+ rgw: check object name after rebuilding it in S3 POST
+
+ Fixes: #6088
+ Backport: bobtail, cuttlefish, dumpling
+
+ When posting an object it is possible to provide a key
+ name that refers to the original filename, however we
+ need to verify that in the end we don't end up with an
+ empty object name.
+
+ Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
+ Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
+ (cherry picked from commit c8ec532fadc0df36e4b265fe20a2ff3e35319744)
+
+commit 1bd74a020b93f154b2d4129d512f6334387de7c7
+Author: Sage Weil <sage@inktank.com>
+Date: Thu Aug 22 17:46:45 2013 -0700
+
+ mon/MonClient: release pending outgoing messages on shutdown
+
+ This fixes a small memory leak when we have messages queued for the mon
+ when we shut down. It is harmless except for the valgrind leak check
+ noise that obscures real leaks.
+
+ Backport: dumpling
+ Signed-off-by: Sage Weil <sage@inktank.com>
+ (cherry picked from commit 309569a6d0b7df263654b7f3f15b910a72f2918d)
+
+commit 24f2669783e2eb9d9af5ecbe106efed93366ba63
+Author: Yehuda Sadeh <yehuda@inktank.com>
+Date: Thu Aug 29 13:06:33 2013 -0700
+
+ rgw: change watch init ordering, don't distribute if can't
+
+ Backport: dumpling
+
+ Moving back the watch initialization after the zone init,
+ as the zone info holds the control pool name. Since zone
+ init might need to create a new system object (that needs
+ to distribute cache), don't try to distribute cache if
+ watch is not yet initialized.
+
+ Reviewed-by: Sage Weil <sage@inktank.com>
+ Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
+ (cherry picked from commit 1d1f7f18dfbdc46fdb09a96ef973475cd29feef5)
+
+commit a708c8ab52e5b1476405a1f817c23b8845fbaab3
+Author: Sage Weil <sage@inktank.com>
+Date: Fri Aug 30 09:41:29 2013 -0700
+
+ ceph-post-file: use mktemp instead of tempfile
+
+ tempfile is a debian thing, apparently; mktemp is present everywhere.
+
+ Signed-off-by: Sage Weil <sage@inktank.com>
+ (cherry picked from commit e60d4e09e9f11e3c34a05cd122341e06c7c889bb)
+
+commit 625f13ee0d6cca48d61dfd65e00517d092552d1c
+Author: Sage Weil <sage@inktank.com>
+Date: Wed Aug 28 09:50:11 2013 -0700
+
+ mon: discover mon addrs, names during election state too
+
+ Currently we only detect new mon addrs and names during the probing phase.
+ For non-trivial clusters, this means we can get into a sticky spot when
+ we discover enough peers to form an quorum, but not all of them, and the
+ undiscovered ones are enough to break the mon ranks and prevent an
+ election.
+
+ One way to work around this is to continue addr and name discovery during
+ the election. We should also consider making the ranks less sensitive to
+ the undefined addrs; that is a separate change.
+
+ Fixes: #4924
+ Backport: dumpling
+ Signed-off-by: Sage Weil <sage@inktank.com>
+ Tested-by: Bernhard Glomm <bernhard.glomm@ecologic.eu>
+ (cherry picked from commit c24028570015cacf1d9e154ffad80bec06a61e7c)
+
+commit 83cfd4386c1fd0fa41aea345704e27f82b524ece
+Author: Dan Mick <dan.mick@inktank.com>
+Date: Thu Aug 22 17:30:24 2013 -0700
+
+ ceph_rest_api.py: create own default for log_file
+
+ common/config thinks the default log_file for non-daemons should be "".
+ Override that so that the default is
+ /var/log/ceph/{cluster}-{name}.{pid}.log
+ since ceph-rest-api is more of a daemon than a client.
+
+ Fixes: #6099
+ Backport: dumpling
+ Signed-off-by: Dan Mick <dan.mick@inktank.com>
+ (cherry picked from commit 2031f391c3df68e0d9e381a1ef3fe58d8939f0a8)
+
+commit 8a1da62d9564a32f7b8963fe298e1ac3ad0ea3d9
+Author: Sage Weil <sage@inktank.com>
+Date: Fri Aug 16 17:59:11 2013 -0700
+
+ ceph-post-file: single command to upload a file to cephdrop
+
+ Use sftp to upload to a directory that only this user and ceph devs can
+ access.
+
+ Distribute an ssh key to connect to the account. This will let us revoke
+ the key in the future if we feel the need. Also distribute a known_hosts
+ file so that users have some confidence that they are connecting to the
+ real ceph drop account and not some third party.
+
+ Signed-off-by: Sage Weil <sage@inktank.com>
+ Reviewed-by: Dan Mick <dan.mick@inktank.com>
+ (cherry picked from commit d08e05e463f1f7106a1f719d81b849435790a3b9)
+
+commit 3f8663477b585dcb528fdd7047c50d9a52d24b95
+Author: Gary Lowell <glowell@inktank.com>
+Date: Thu Aug 22 13:29:32 2013 -0700
+
+ ceph.spec.in: remove trailing paren in previous commit
+
+ Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
+
+commit 23fb908cb3ac969c874ac12755d20ed2f636e1b9
+Author: Gary Lowell <glowell@inktank.com>
+Date: Thu Aug 22 11:07:16 2013 -0700
+
+ ceph.spec.in: Don't invoke debug_package macro on centos.
+
+ If the redhat-rpm-config package is installed, the debuginfo rpms will
+ be built by default. The build will fail when the package installed
+ and the specfile also invokes the macro.
+
+ Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
+
+commit 11f5853d8178ab60ab948d373c1a1f67324ce3bd
+Author: Sage Weil <sage@inktank.com>
+Date: Sat Aug 24 14:04:09 2013 -0700
+
+ osd: install admin socket commands after signals
+
+ This lets us tell by the presence of the admin socket commands whether
+ a signal will make us shut down cleanly. See #5924.
+
+ Signed-off-by: Sage Weil <sage@inktank.com>
+ Reviewed-by: Samuel Just <sam.just@inktank.com>
+ (cherry picked from commit c5b5ce120a8ce9116be52874dbbcc39adec48b5c)
+
+commit 39adc0195e6016ce36828885515be1bffbc10ae1
+Author: Sage Weil <sage@inktank.com>
+Date: Tue Aug 20 22:39:09 2013 -0700
+
+ ceph-disk: partprobe after creating journal partition
+
+ At least one user reports that a partprobe is needed after creating the
+ journal partition. It is not clear why sgdisk is not doing it, but this
+ fixes ceph-disk for them, and should be harmless for other users.
+
+ Fixes: #5599
+ Tested-by: lurbs in #ceph
+ Signed-off-by: Sage Weil <sage@inktank.com>
+ (cherry picked from commit 2af59d5e81c5e3e3d7cfc50d9330d7364659c5eb)
+ (cherry picked from commit 3e42df221315679605d68b2875aab6c7eb6b3cc4)
+
+commit 6a4fe7b9b068ae990d6404921a46631fe9ebcd31
+Author: Sage Weil <sage@inktank.com>
+Date: Tue Aug 20 11:27:23 2013 -0700
+
+ mon/Paxos: always refresh after any store_state
+
+ If we store any new state, we need to refresh the services, even if we
+ are still in the midst of Paxos recovery. This is because the
+ subscription path will share any committed state even when paxos is
+ still recovering. This prevents a race like:
+
+ - we have maps 10..20
+ - we drop out of quorum
+ - we are elected leader, paxos recovery starts
+ - we get one LAST with committed states that trim maps 10..15
+ - we get a subscribe for map 10..20
+ - we crash because 10 is no longer on disk because the PaxosService
+ is out of sync with the on-disk state.
+
+ Fixes: #6045
+ Backport: dumpling
+ Signed-off-by: Sage Weil <sage@inktank.com>
+ Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
+ (cherry picked from commit 981eda9f7787c83dc457f061452685f499e7dd27)
+
+commit 13d396e46ed9200e4b9f21db2f0a8efbc5998d82
+Author: Sage Weil <sage@inktank.com>
+Date: Tue Aug 20 11:27:09 2013 -0700
+
+ mon/Paxos: return whether store_state stored anything
+
+ Signed-off-by: Sage Weil <sage@inktank.com>
+ Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
+ (cherry picked from commit 7e0848d8f88f156a05eef47a9f730b772b64fbf2)
+
+commit f248383bacff76203fa94716cfdf6cf766da24a7
+Author: Sage Weil <sage@inktank.com>
+Date: Tue Aug 20 11:26:57 2013 -0700
+
+ mon/Paxos: cleanup: use do_refresh from handle_commit
+
+ This avoid duplicated code by using the helper created exactly for this
+ purpose.
+
+ Signed-off-by: Sage Weil <sage@inktank.com>
+ Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
+ (cherry picked from commit b9dee2285d9fe8533fa98c940d5af7b0b81f3d33)
+
+commit 02608a12d4e7592784148a62a47d568efc24079d
+Author: Sage Weil <sage@inktank.com>
+Date: Thu Aug 15 21:48:06 2013 -0700
+
+ osdc/ObjectCacher: do not merge rx buffers
+
+ We do not try to merge rx buffers currently. Make that explicit and
+ documented in the code that it is not supported. (Otherwise the
+ last_read_tid values will get lost and read results won't get applied
+ to the cache properly.)
+
+ Signed-off-by: Sage Weil <sage@inktank.com>
+ (cherry picked from commit 1c50c446152ab0e571ae5508edb4ad7c7614c310)
+
+commit 0e2bfe71965eeef29b47e8032637ea820a7ce49c
+Author: Sage Weil <sage@inktank.com>
+Date: Thu Aug 15 21:47:18 2013 -0700
+
+ osdc/ObjectCacher: match reads with their original rx buffers
+
+ Consider a sequence like:
+
+ 1- start read on 100~200
+ 100~200 state rx
+ 2- truncate to 200
+ 100~100 state rx
+ 3- start read on 200~200
+ 100~100 state rx
+ 200~200 state rx
+ 4- get 100~200 read result
+
+ Currently this makes us crash on
+
+ osdc/ObjectCacher.cc: 738: FAILED assert(bh->length() <= start+(loff_t)length-opos)
+
+ when processing the second 200~200 bufferhead (it is too big). The
+ larger issue, though, is that we should not be looking at this data at
+ all; it has been truncated away.
+
+ Fix this by marking each rx buffer with the read request that is sent to
+ fill it, and only fill it from that read request. Then the first reply
+ will fill the first 100~100 extend but not touch the other extent; the
+ second read will do that.
+
+ Signed-off-by: Sage Weil <sage@inktank.com>
+ (cherry picked from commit b59f930ae147767eb4c9ff18c3821f6936a83227)
+
+commit 6b51c960715971a0351e8203d4896cb0c4138a3f
+Author: Sage Weil <sage@inktank.com>
+Date: Thu Aug 22 15:54:48 2013 -0700
+
+ mon/Paxos: fix another uncommitted value corner case
+
+ It is possible that we begin the paxos recovery with an uncommitted
+ value for, say, commit 100. During last/collect we discover 100 has been
+ committed already. But also, another node provides an uncommitted value
+ for 101 with the same pn. Currently, we refuse to learn it, because the
+ pn is not strictly > than our current uncommitted pn... even though it is
+ the next last_committed+1 value that we need.
+
+ There are two possible fixes here:
+
+ - make this a >= as we can accept newer values from the same pn.
+ - discard our uncommitted value metadata when we commit the value.
+
+ Let's do both!
+
+ Fixes: #6090
+ Signed-off-by: Sage Weil <sage@inktank.com>
+ (cherry picked from commit fe5010380a3a18ca85f39403e8032de1dddbe905)
+
+commit b3a280d5af9d06783d2698bd434940de94ab0fda
+Author: Sage Weil <sage@inktank.com>
+Date: Fri Aug 23 11:45:35 2013 -0700
+
+ os: make readdir_r buffers larger
+
+ PATH_MAX isn't quite big enough.
+
+ Backport: dumpling, cuttlefish, bobtail
+ Signed-off-by: Sage Weil <sage@inktank.com>
+ (cherry picked from commit 99a2ff7da99f8cf70976f05d4fe7aa28dd7afae5)
+
+commit 989a664ef0d1c716cab967f249112f595cf98c43
+Author: Sage Weil <sage@inktank.com>
+Date: Fri Aug 23 11:45:08 2013 -0700
+
+ os: fix readdir_r buffer size
+
+ The buffer needs to be big or else we're walk all over the stack.
+
+ Backport: dumpling, cuttlefish, bobtail
+ Signed-off-by: Sage Weil <sage@inktank.com>
+ (cherry picked from commit 2df66d9fa214e90eb5141df4d5755b57e8ba9413)
+
+ Conflicts:
+
+ src/os/BtrfsFileStoreBackend.cc
+
+commit a4cca31c82bf0e84272e01eb1b3188dfdb5b5615
+Author: Yehuda Sadeh <yehuda@inktank.com>
+Date: Thu Aug 22 10:53:12 2013 -0700
+
+ rgw: fix crash when creating new zone on init
+
+ Moving the watch/notify init before the zone init,
+ as we might need to send a notification.
+
+ Reviewed-by: Sage Weil <sage@inktank.com>
+ Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
+ (cherry picked from commit 3d55534268de7124d29bd365ea65da8d2f63e501)
+
+commit 4cf6996803ef66f2b6083f73593259d45e2740a3
+Author: Yehuda Sadeh <yehuda@inktank.com>
+Date: Mon Aug 19 08:40:16 2013 -0700
+
+ rgw: change cache / watch-notify init sequence
+
+ Fixes: #6046
+ We were initializing the watch-notify (through the cache
+ init) before reading the zone info which was much too
+ early, as we didn't have the control pool name yet. Now
+ simplifying init/cleanup a bit, cache doesn't call watch/notify
+ init and cleanup directly, but rather states its need
+ through a virtual callback.
+
+ Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
+ Reviewed-by: Sage Weil <sage@inktank.com>
+ (cherry picked from commit d26ba3ab0374e77847c742dd00cb3bc9301214c2)
+
+commit aea6de532b0b843c3a8bb76d10bab8476f0d7c09
+Author: Alexandre Oliva <oliva@gnu.org>
+Date: Thu Aug 22 03:40:22 2013 -0300
+
+ enable mds rejoin with active inodes' old parent xattrs
+
+ When the parent xattrs of active inodes that the mds attempts to open
+ during rejoin lack pool info (struct_v < 5), this field will be filled
+ in with -1, causing the mds to retry fetching a backtrace with a pool
+ number that matches the expected value, which fails and causes the
+ err==-ENOENT branch to be taken and retry pool 1, which succeeds, but
+ with pool -1, and so keeps on bouncing between the two retry cases
+ forever.
+
+ This patch arranges for the mds to go along with pool -1 instead of
+ insisting that it be refetched, enabling it to complete recovery
+ instead of eating cpu, network bandwidth and metadata osd's resources
+ like there's no tomorrow, in what AFAICT is an infinite and very busy
+ loop.
+
+ This is not a new problem: I've had it even before upgrading from
+ Cuttlefish to Dumpling, I'd just never managed to track it down, and
+ force-unmounting the filesystem and then restarting the mds was an
+ easier (if inconvenient) work-around, particularly because it always
+ hit when the filesystem was under active, heavy-ish use (or there
+ wouldn't be much reason for caps recovery ;-)
+
+ There are two issues not addressed in this patch, however. One is
+ that nothing seems to proactively update the parent xattr when it is
+ found to be outdated, so it remains out of date forever. Not even
+ renaming top-level directories causes the xattrs to be recursively
+ rewritten. AFAICT that's a bug.
+
+ The other is that inodes that don't have a parent xattr (created by
+ even older versions of ceph) are reported as non-existing in the mds
+ rejoin message, because the absence of the parent xattr is signaled as
+ a missing inode (?failed to reconnect caps for missing inodes?). I
+ suppose this may cause more serious recovery problems.
+
+ I suppose a global pass over the filesystem tree updating parent
+ xattrs that are out-of-date would be desirable, if we find any parent
+ xattrs still lacking current information; it might make sense to
+ activate it as a background thread from the backtrace decoding
+ function, when it finds a parent xattr that's too out-of-date, or as a
+ separate client (ceph-fsck?).
+
+ Backport: dumpling, cuttlefish
+ Signed-off-by: Alexandre Oliva <oliva@gnu.org>
+ Reviewed-by: Zheng, Yan <zheng.z.yan@intel.com>
+ (cherry picked from commit 617dc36d477fd83b2d45034fe6311413aa1866df)
+
+commit 0738bdf92f5e5eb93add152a4135310ac7ea1c91
+Author: David Disseldorp <ddiss@suse.de>
+Date: Mon Jul 29 17:05:44 2013 +0200
+
+ mds: remove waiting lock before merging with neighbours
+
+ CephFS currently deadlocks under CTDB's ping_pong POSIX locking test
+ when run concurrently on multiple nodes.
+ The deadlock is caused by failed removal of a waiting_locks entry when
+ the waiting lock is merged with an existing lock, e.g:
+
+ Initial MDS state (two clients, same file):
+ held_locks -- start: 0, length: 1, client: 4116, pid: 7899, type: 2
+ start: 2, length: 1, client: 4110, pid: 40767, type: 2
+ waiting_locks -- start: 1, length: 1, client: 4116, pid: 7899, type: 2
+
+ Waiting lock entry 4116@1:1 fires:
+ handle_client_file_setlock: start: 1, length: 1,
+ client: 4116, pid: 7899, type: 2
+
+ MDS state after lock is obtained:
+ held_locks -- start: 0, length: 2, client: 4116, pid: 7899, type: 2
+ start: 2, length: 1, client: 4110, pid: 40767, type: 2
+ waiting_locks -- start: 1, length: 1, client: 4116, pid: 7899, type: 2
+
+ Note that the waiting 4116@1:1 lock entry is merged with the existing
+ 4116@0:1 held lock to become a 4116@0:2 held lock. However, the now
+ handled 4116@1:1 waiting_locks entry remains.
+
+ When handling a lock request, the MDS calls adjust_locks() to merge
+ the new lock with available neighbours. If the new lock is merged,
+ then the waiting_locks entry is not located in the subsequent
+ remove_waiting() call because adjust_locks changed the new lock to
+ include the old locks.
+ This fix ensures that the waiting_locks entry is removed prior to
+ modification during merge.
+
+ Signed-off-by: David Disseldorp <ddiss@suse.de>
+ Reviewed-by: Greg Farnum <greg@inktank.com>
+ (cherry picked from commit 476e4902907dfadb3709ba820453299ececf990b)
+
+commit a0ac88272511d670b5c3756dda2d02c93c2e9776
+Author: Dan Mick <dan.mick@inktank.com>
+Date: Tue Aug 20 11:10:42 2013 -0700
+
+ mon/PGMap: OSD byte counts 4x too large (conversion to bytes overzealous)
+
+ Fixes: #6049
+ Signed-off-by: Dan Mick <dan.mick@inktank.com>
+ (cherry picked from commit eca53bbf583027397f0d5e050a76498585ecb059)
+
+commit 87b19c33ce29e2ca4fc49a2adeb12d3f14ca90a9
+Author: Alfredo Deza <alfredo.deza@inktank.com>
+Date: Fri Aug 23 08:56:07 2013 -0400
+
+ ceph-disk: specify the filetype when mounting
+
+ Signed-off-by: Alfredo Deza <alfredo.deza@inktank.com>
+ Reviewed-by: Sage Weil <sage@inktank.com>
+ (cherry picked from commit f040020fb2a7801ebbed23439159755ff8a3edbd)