path: root/fs/btrfs/inode.c
Commit message (Author, Age)
...
| * btrfs: Better csum error message for data csum mismatch (Qu Wenruo, 2017-02-17)
| |   The original csum error message only outputs the inode number, offset, checksum and expected checksum. However, no root objectid is output, which sometimes makes debugging quite painful in the multi-subvolume case (including relocation).
| |
| |   Also, the checksum output is decimal, which seldom makes sense for users/developers and is hard to read most of the time.
| |
| |   This patch adds the root objectid, which will be printed as %lld for rootids larger than LAST_FREE_OBJECTID, and hex csum output for better readability.
| |
| |   Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
| |   Reviewed-by: David Sterba <dsterba@suse.com>
| |   Signed-off-by: David Sterba <dsterba@suse.com>
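The formatting rule described in this commit can be sketched in userspace C. This is only an illustration, not the kernel code: the helper name `format_csum_error`, the exact message wording, the value chosen for `LAST_FREE_OBJECTID`, and the use of `>=` at the boundary are all assumptions made for the example.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed value for illustration; the kernel headers define the real one. */
#define LAST_FREE_OBJECTID ((uint64_t)-256)

/*
 * Hypothetical helper: format a data csum mismatch message into buf.
 * Root ids at or above LAST_FREE_OBJECTID are printed as signed values
 * (so a relocation root reads as a small negative number), and the
 * checksums are printed in hex.
 */
static int format_csum_error(char *buf, size_t len, uint64_t rootid,
			     uint64_t ino, uint64_t off,
			     uint32_t csum, uint32_t expected)
{
	if (rootid >= LAST_FREE_OBJECTID)
		return snprintf(buf, len,
				"csum failed root %" PRId64 " ino %" PRIu64
				" off %" PRIu64 " csum 0x%08" PRIx32
				" expected csum 0x%08" PRIx32,
				(int64_t)rootid, ino, off, csum, expected);

	return snprintf(buf, len,
			"csum failed root %" PRIu64 " ino %" PRIu64
			" off %" PRIu64 " csum 0x%08" PRIx32
			" expected csum 0x%08" PRIx32,
			rootid, ino, off, csum, expected);
}
```

Printing the special root ids as signed numbers keeps them short and recognizable, where the unsigned form would be a 20-digit value.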
| * Btrfs: kill trans in run_delalloc_nocow and btrfs_cross_ref_exist (Liu Bo, 2017-02-14)
| |   run_delalloc_nocow has used trans in two places where they don't actually need @trans.
| |
| |   For btrfs_lookup_file_extent, we search for file extents without COWing anything, and for btrfs_cross_ref_exist, the only place where we need @trans is dereferencing it in order to get running_transaction, which we could easily get from the global fs_info.
| |
| |   Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
| |   Reviewed-by: David Sterba <dsterba@suse.com>
| |   Signed-off-by: David Sterba <dsterba@suse.com>
| * Btrfs: fix wrong argument for btrfs_lookup_ordered_range (Liu Bo, 2017-02-14)
| |   Commit "Btrfs: btrfs_page_mkwrite: Reserve space in sectorsized units" (d0b7da88) did this, but btrfs_lookup_ordered_range expects a 'length' rather than a 'page_end'.
| |
| |   Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
| |   Reviewed-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
| |   Signed-off-by: David Sterba <dsterba@suse.com>
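To illustrate why passing an end offset where a length is expected matters, here is a small userspace sketch. The functions `range_overlaps` and `range_len` are hypothetical stand-ins written for this example; they are not the btrfs lookup routine.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a lookup that takes (start, len):
 * does [start, start + len) overlap the extent [ext_start, ext_start + ext_len)?
 */
static bool range_overlaps(uint64_t ext_start, uint64_t ext_len,
			   uint64_t start, uint64_t len)
{
	return start < ext_start + ext_len && ext_start < start + len;
}

/* Convert an inclusive [start, end] pair into the length the lookup expects. */
static uint64_t range_len(uint64_t start, uint64_t end)
{
	return end - start + 1;
}
```

For a page covering bytes 8192 to 12287 and an unrelated extent at 16384, `range_overlaps(16384, 4096, 8192, range_len(8192, 12287))` is false, while passing 12287 directly as the length makes the queried range extend far past the page and reports a spurious overlap.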
| * Btrfs: fix another race between truncate and lockless dio write (Liu Bo, 2017-02-14)
| |   Dio writes can update i_size in btrfs_get_blocks_direct when writing to an offset beyond EOF so that endio can update disk_i_size correctly (because we don't update disk_i_size beyond i_size).
| |
| |   However, when truncating down a file, we first update i_size and then wait for in-flight lockless dio reads/writes. According to the above, i_size may have been changed by dio writes, and file extents don't get truncated.
| |
| |   Since lockless dio writes are always overwrites, i_size is not supposed to be changed, so this adds a check to filter out this case.
| |
| |   The race can be reproduced by fstests/generic/299 with the patch "Btrfs: fix btrfs_ordered_update_i_size to update disk_i_size properly" applied.
| |
| |   Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
| |   Signed-off-by: David Sterba <dsterba@suse.com>
| * Btrfs: fix comment in btrfs_page_mkwrite (Liu Bo, 2017-02-14)
| |   The comment about "page_mkwrite gets called every time the page is dirtied" in btrfs_page_mkwrite is not correct; it only gets called the first time the page gets dirtied after the page faults in.
| |
| |   However, we don't need to touch the code because it works well, although the proper logic is to check if delalloc bits have been set and if so, go free reserved space; if not, set the delalloc bits for the dirty page range.
| |
| |   Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
| |   Signed-off-by: David Sterba <dsterba@suse.com>
| * Btrfs: fix btrfs_ordered_update_i_size to update disk_i_size properly (Liu Bo, 2017-02-14)
| |   btrfs_ordered_update_i_size can be called by truncate and endio, but only endio takes an ordered_extent which contains the completed IO.
| |
| |   While truncating down a file, if there are some in-flight IOs, btrfs_ordered_update_i_size in endio will set disk_i_size to @orig_offset, which is zero. If truncating down fails somehow, we try to recover the in-memory isize with this zeroed disk_i_size.
| |
| |   Fix it by only updating disk_i_size with @orig_offset when btrfs_ordered_update_i_size is not called from endio while truncating down, and by waiting for in-flight IOs to complete their work before recovering the in-memory size.
| |
| |   Besides fixing the above issue, add an assertion for last_size to double check we truncate down to the desired size.
| |
| |   Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
| |   Signed-off-by: David Sterba <dsterba@suse.com>
| * btrfs: fix over-80 lines introduced by previous cleanups (David Sterba, 2017-02-14)
| |   This goes as a separate patch because fixing that inside the patches caused too many conflicts.
| |
| |   Signed-off-by: David Sterba <dsterba@suse.com>
| * btrfs: Make btrfs_unlink_inode take btrfs_inode (Nikolay Borisov, 2017-02-14)
| |   Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com>
| |   Signed-off-by: David Sterba <dsterba@suse.com>
| * btrfs: Make btrfs_del_inode_ref take btrfs_inode (Nikolay Borisov, 2017-02-14)
| |   Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com>
| |   Signed-off-by: David Sterba <dsterba@suse.com>
| * btrfs: Make btrfs_del_dir_entries_in_log take btrfs_inode (Nikolay Borisov, 2017-02-14)
| |   Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com>
| |   Signed-off-by: David Sterba <dsterba@suse.com>
| * btrfs: Make btrfs_log_new_name take btrfs_inode (Nikolay Borisov, 2017-02-14)
| |   Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com>
| |   Signed-off-by: David Sterba <dsterba@suse.com>
| * btrfs: Make btrfs_inode_in_log take btrfs_inode (Nikolay Borisov, 2017-02-14)
| |   Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com>
| |   Signed-off-by: David Sterba <dsterba@suse.com>
| * btrfs: Make btrfs_record_unlink_dir take btrfs_inode (Nikolay Borisov, 2017-02-14)
| |   Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com>
| |   Signed-off-by: David Sterba <dsterba@suse.com>
| * btrfs: Make btrfs_inode_delayed_dir_index_count take btrfs_inode (Nikolay Borisov, 2017-02-14)
| |   Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com>
| |   Signed-off-by: David Sterba <dsterba@suse.com>
| * btrfs: Make btrfs_commit_inode_delayed_inode take btrfs_inode (Nikolay Borisov, 2017-02-14)
| |   Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com>
| |   Signed-off-by: David Sterba <dsterba@suse.com>
| * btrfs: Make btrfs_remove_delayed_node take btrfs_inode (Nikolay Borisov, 2017-02-14)
| |   Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com>
| |   Signed-off-by: David Sterba <dsterba@suse.com>
| * btrfs: Make btrfs_kill_delayed_inode_items take btrfs_inode (Nikolay Borisov, 2017-02-14)
| |   Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com>
| |   Signed-off-by: David Sterba <dsterba@suse.com>
| * btrfs: Make btrfs_delayed_delete_inode_ref take btrfs_inode (Nikolay Borisov, 2017-02-14)
| |   Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com>
| |   Signed-off-by: David Sterba <dsterba@suse.com>
| * btrfs: Make btrfs_delete_delayed_dir_index take btrfs_inode (Nikolay Borisov, 2017-02-14)
| |   Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com>
| |   Signed-off-by: David Sterba <dsterba@suse.com>
| * btrfs: Make btrfs_ino take a struct btrfs_inode (Nikolay Borisov, 2017-02-14)
| |   Currently btrfs_ino takes a struct inode and this causes a lot of internal btrfs functions which consume this ino to take a VFS inode, rather than btrfs' own struct btrfs_inode. In order to fix this "leak" of VFS structs into the internals of btrfs first it's necessary to eliminate all uses of struct inode for the purpose of inode. This patch does that by using BTRFS_I to convert an inode to btrfs_inode. With this problem eliminated subsequent patches will start eliminating the passing of struct inode altogether, eventually resulting in a lot cleaner code.
| |
| |   Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com>
| |   [ fix btrfs_get_extent tracepoint prototype ]
| |   Signed-off-by: David Sterba <dsterba@suse.com>
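The conversion mentioned in this commit relies on the VFS inode being embedded inside the btrfs inode. A minimal userspace sketch of that pattern follows; the struct layouts are heavily simplified and the `objectid` field is a stand-in chosen for this example, so treat it as an illustration of the embedding technique rather than the kernel definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the VFS inode. */
struct inode {
	uint64_t i_ino;
};

/* Simplified stand-in for the btrfs inode, which embeds the VFS inode. */
struct btrfs_inode {
	uint64_t objectid;	/* assumption: stands in for the btrfs key objectid */
	struct inode vfs_inode;	/* embedded VFS inode */
};

/* Recover the containing btrfs_inode from a pointer to its embedded inode. */
static inline struct btrfs_inode *BTRFS_I(struct inode *inode)
{
	return (struct btrfs_inode *)((char *)inode -
				      offsetof(struct btrfs_inode, vfs_inode));
}

/* After the change, internal helpers take the btrfs type directly. */
static inline uint64_t btrfs_ino(const struct btrfs_inode *inode)
{
	return inode->objectid;
}
```

A caller that still holds a VFS inode would write `btrfs_ino(BTRFS_I(inode))`, which is the mechanical change this patch applies throughout.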
| * btrfs: add wrapper for counting BTRFS_MAX_EXTENT_SIZE (David Sterba, 2017-02-14)
| |   The expression is open-coded in several places; this asks for a wrapper. As we know that MAX_EXTENT fits in a u32, we can use the appropriate division helper. This cascades to the result type updates.
| |
| |   The compiler is clever enough to use a shift instead of integer division, so there's no change in the generated assembly.
| |
| |   Signed-off-by: David Sterba <dsterba@suse.com>
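A wrapper of this kind can be sketched as a round-up division. The constant below (128 MiB) and the helper name `count_max_extents` are assumptions for the example; the kernel version uses its own division helper rather than the plain `/` operator shown here.

```c
#include <stdint.h>

/* Assumed maximum extent size for illustration: 128 MiB, a power of two. */
#define MAX_EXTENT_SIZE (128U * 1024 * 1024)

/*
 * Number of maximum-sized extents needed to cover `size` bytes,
 * rounding up. Because the divisor is a power of two, a compiler can
 * turn the division into a shift.
 */
static inline uint32_t count_max_extents(uint64_t size)
{
	return (uint32_t)((size + MAX_EXTENT_SIZE - 1) / MAX_EXTENT_SIZE);
}
```

With this definition, a size of exactly one maximum extent counts as 1 and one byte more counts as 2.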
| * btrfs: remove unused logic of limiting async delalloc pages (David Sterba, 2017-02-14)
| |   A proposed patch in https://marc.info/?l=linux-btrfs&m=147859791003837 pointed out a bad limit threshold in cow_file_range_async, but it turned out that the whole logic is not necessary and is done by writeback. We agreed to remove it.
| |
| |   Signed-off-by: David Sterba <dsterba@suse.com>
| * btrfs: consolidate auto defrag kick off policies (Anand Jain, 2017-02-14)
| |   As of now, writes smaller than 64k for non-compressed extents and 16k for compressed extents inside EOF are considered candidates for auto defrag. Put these checks together in one place.
| |
| |   Signed-off-by: Anand Jain <anand.jain@oracle.com>
| |   Signed-off-by: David Sterba <dsterba@suse.com>
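The consolidated policy can be sketched as a single predicate. The function name `is_defrag_candidate` is hypothetical, and the way "inside EOF" is tested here (the write does not cover the whole file from offset 0 to the end) is one plausible reading chosen for the example, not something the commit message spells out.

```c
#include <stdbool.h>
#include <stdint.h>

#define SMALL_WRITE_UNCOMPRESSED (64U * 1024)	/* 64k threshold from the message */
#define SMALL_WRITE_COMPRESSED   (16U * 1024)	/* 16k threshold from the message */

/*
 * Hypothetical predicate: is the write covering the inclusive byte range
 * [start, end] a candidate for auto defrag? `i_size` is the file size.
 */
static bool is_defrag_candidate(uint64_t start, uint64_t end,
				uint64_t i_size, bool compressed)
{
	uint64_t len = end - start + 1;
	uint64_t small_write = compressed ? SMALL_WRITE_COMPRESSED
					  : SMALL_WRITE_UNCOMPRESSED;

	/* Assumed reading of "inside EOF": not a write of the entire file. */
	return len < small_write && (start > 0 || end + 1 < i_size);
}
```

Keeping both thresholds in one function means the compressed and uncompressed write paths cannot drift apart.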
| * btrfs: use BTRFS_COMPRESS_NONE to specify no compression (Anand Jain, 2017-02-14)
| |   Signed-off-by: Anand Jain <anand.jain@oracle.com>
| |   Signed-off-by: David Sterba <dsterba@suse.com>
| * btrfs: fix up misleading GFP_NOFS usage in btrfs_releasepage (Michal Hocko, 2017-02-14)
| |   b335b0034e25 ("Btrfs: Avoid using __GFP_HIGHMEM with slab allocator") reduced the allocation mask in btrfs_releasepage to GFP_NOFS just to prevent giving an inappropriate gfp mask to the slab allocator deeper down the callchain (in alloc_extent_state). This is wrong for two reasons: a) GFP_NOFS might be just too restrictive for the calling context, and b) it is better to tweak the gfp mask down where that is actually needed.
| |
| |   So just remove the mask tweaking from btrfs_releasepage and move it down to alloc_extent_state where it is needed.
| |
| |   Signed-off-by: Michal Hocko <mhocko@suse.com>
| |   Reviewed-by: David Sterba <dsterba@suse.com>
| |   Signed-off-by: David Sterba <dsterba@suse.com>
* | mm, fs: reduce fault, page_mkwrite, and pfn_mkwrite to take only vmf (Dave Jiang, 2017-02-24)
|/
|   ->fault(), ->page_mkwrite(), and ->pfn_mkwrite() calls do not need to take a vma and vmf parameter when the vma already resides in vmf.
|
|   Remove the vma parameter to simplify things.
|
|   [arnd@arndb.de: fix ARM build]
|   Link: http://lkml.kernel.org/r/20170125223558.1451224-1-arnd@arndb.de
|   Link: http://lkml.kernel.org/r/148521301778.19116.10840599906674778980.stgit@djiang5-desk3.ch.intel.com
|   Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|   Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|   Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
|   Cc: Theodore Ts'o <tytso@mit.edu>
|   Cc: Darrick J. Wong <darrick.wong@oracle.com>
|   Cc: Matthew Wilcox <mawilcox@microsoft.com>
|   Cc: Dave Hansen <dave.hansen@intel.com>
|   Cc: Christoph Hellwig <hch@lst.de>
|   Cc: Jan Kara <jack@suse.com>
|   Cc: Dan Williams <dan.j.williams@intel.com>
|   Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|   Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'for-linus-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs (Linus Torvalds, 2017-01-27)
|\
| |   Pull btrfs updates from Chris Mason:
| |    "Some fixes that we've collected from the list.
| |
| |     We still have one more pending to nail down a regression in lzo compression, but I wanted to get this batch out the door"
| |
| |   * 'for-linus-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
| |     Btrfs: remove ->{get, set}_acl() from btrfs_dir_ro_inode_operations
| |     Btrfs: disable xattr operations on subvolume directories
| |     Btrfs: remove old tree_root case in btrfs_read_locked_inode()
| |     Btrfs: fix truncate down when no_holes feature is enabled
| |     Btrfs: Fix deadlock between direct IO and fast fsync
| |     btrfs: fix false enospc error when truncating heavily reflinked file
| * Btrfs: remove ->{get, set}_acl() from btrfs_dir_ro_inode_operations (Omar Sandoval, 2017-01-26)
| |   Subvolume directory inodes can't have ACLs.
| |
| |   Cc: <stable@vger.kernel.org> # 4.9.x
| |   Signed-off-by: Omar Sandoval <osandov@fb.com>
| |   Reviewed-by: David Sterba <dsterba@suse.com>
| |   Signed-off-by: Chris Mason <clm@fb.com>
| * Btrfs: disable xattr operations on subvolume directories (Omar Sandoval, 2017-01-26)
| |   When you snapshot a subvolume containing a subvolume, you get a placeholder directory where the subvolume would be. These directory inodes have ->i_ops set to btrfs_dir_ro_inode_operations. Previously, these i_ops didn't include the xattr operation callbacks. The conversion to xattr_handlers missed this case, leading to bogus attempts to set xattrs on these inodes. This manifested itself as failures when running delayed inodes.
| |
| |   To fix this, clear IOP_XATTR in ->i_opflags on these inodes.
| |
| |   Fixes: 6c6ef9f26e59 ("xattr: Stop calling {get,set,remove}xattr inode operations")
| |   Cc: Andreas Gruenbacher <agruenba@redhat.com>
| |   Reported-by: Chris Murphy <lists@colorremedies.com>
| |   Tested-by: Chris Murphy <lists@colorremedies.com>
| |   Cc: <stable@vger.kernel.org> # 4.9.x
| |   Signed-off-by: Omar Sandoval <osandov@fb.com>
| |   Reviewed-by: David Sterba <dsterba@suse.com>
| |   Signed-off-by: Chris Mason <clm@fb.com>
| * Btrfs: remove old tree_root case in btrfs_read_locked_inode() (Omar Sandoval, 2017-01-26)
| |   As Jeff explained in c2951f32d36c ("btrfs: remove old tree_root dirent processing in btrfs_real_readdir()"), supporting this old format is no longer necessary since the Btrfs magic number has been updated since we changed to the current format. There are other places where we still handle this old format, but since this is part of a fix that is going to stable, I'm only removing this one for now.
| |
| |   Cc: <stable@vger.kernel.org> # 4.9.x
| |   Signed-off-by: Omar Sandoval <osandov@fb.com>
| |   Reviewed-by: David Sterba <dsterba@suse.com>
| |   Signed-off-by: Chris Mason <clm@fb.com>
| * Btrfs: fix truncate down when no_holes feature is enabled (Liu Bo, 2017-01-19)
| |   For such a file mapping,
| |
| |     [0-4k][hole][8k-12k]
| |
| |   In NO_HOLES mode, we don't have the [hole] extent any more. Commit c1aa45759e90 ("Btrfs: fix shrinking truncate when the no_holes feature is enabled") fixed disk isize not being updated in NO_HOLES mode when data is not flushed.
| |
| |   However, even if data has been flushed, we can still have trouble in updating disk isize since we updated disk isize to 'start' of the last evicted extent.
| |
| |   Reviewed-by: Chris Mason <clm@fb.com>
| |   Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
| |   Signed-off-by: David Sterba <dsterba@suse.com>
| * Btrfs: Fix deadlock between direct IO and fast fsync (Chandan Rajendra, 2017-01-19)
| |   The following deadlock is seen when executing the generic/113 test:
| |
| |   ---------------------------------------------------+---------------------------------------------------
| |    Direct I/O task                                    | Fast fsync task
| |   ---------------------------------------------------+---------------------------------------------------
| |    btrfs_direct_IO                                    |
| |     __blockdev_direct_IO                              |
| |      do_blockdev_direct_IO                            |
| |       do_direct_IO                                    |
| |        btrfs_get_blocks_direct                        |
| |         while (blocks needs to written)               |
| |          get_more_blocks (first iteration)            |
| |           btrfs_get_blocks_direct                     |
| |            btrfs_create_dio_extent                    |
| |             down_read(&BTRFS_I(inode)->dio_sem)       |
| |             Create and add extent map and             |
| |               ordered extent                          |
| |             up_read(&BTRFS_I(inode)->dio_sem)         |
| |                                                       | btrfs_sync_file
| |                                                       |  btrfs_log_dentry_safe
| |                                                       |   btrfs_log_inode_parent
| |                                                       |    btrfs_log_inode
| |                                                       |     btrfs_log_changed_extents
| |                                                       |      down_write(&BTRFS_I(inode)->dio_sem)
| |                                                       |      Collect new extent maps and ordered extents
| |                                                       |      wait for ordered extent completion
| |          get_more_blocks (second iteration)           |
| |           btrfs_get_blocks_direct                     |
| |            btrfs_create_dio_extent                    |
| |             down_read(&BTRFS_I(inode)->dio_sem)       |
| |   ---------------------------------------------------+---------------------------------------------------
| |
| |   In the above description, the Btrfs direct I/O code path has not yet started submitting bios for the file range covered by the initial ordered extent. Meanwhile, the fast fsync task obtains the write semaphore and waits for I/O on the ordered extent to complete. However, the direct I/O task is now blocked on obtaining the read semaphore.
| |
| |   To resolve the deadlock, this commit modifies the direct I/O code path to obtain the read semaphore before invoking __blockdev_direct_IO(). The semaphore is then given up after __blockdev_direct_IO() returns. This allows the direct I/O code to complete I/O on all the ordered extents it creates.
| |
| |   Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
| |   Reviewed-by: Filipe Manana <fdmanana@suse.com>
| |   Signed-off-by: David Sterba <dsterba@suse.com>
| * btrfs: fix false enospc error when truncating heavily reflinked file (Wang Xiaoguang, 2017-01-19)
| |   The test script below can reveal this bug:
| |
| |     dd if=/dev/zero of=fs.img bs=$((1024*1024)) count=100
| |     dev=$(losetup --show -f fs.img)
| |     mkdir -p /mnt/mntpoint
| |     mkfs.btrfs -f $dev
| |     mount $dev /mnt/mntpoint
| |     cd /mnt/mntpoint
| |
| |     echo "workdir is: /mnt/mntpoint"
| |     blocksize=$((128 * 1024))
| |     dd if=/dev/zero of=testfile bs=$blocksize count=1
| |     sync
| |     count=$((17*1024*1024*1024/blocksize))
| |     echo "file size is:" $((count*blocksize))
| |     for ((i = 1; i <= $count; i++)); do
| |         dst_offset=$((blocksize * i))
| |         xfs_io -f -c "reflink testfile 0 $dst_offset $blocksize"\
| |             testfile > /dev/null
| |     done
| |     sync
| |
| |     truncate --size 0 testfile
| |
| |   The last truncate operation fails with ENOSPC, but it should not fail.
| |
| |   In btrfs_truncate(), we use a temporary block_rsv to do the truncate operation. With every btrfs_truncate_inode_items() call, we migrate space to this block_rsv but forget to clean up the previous reservation, which makes this block_rsv's reserved bytes keep growing. The reserved space is only released at the end of btrfs_truncate(), and this metadata leak impacts other metadata reservations. In this case, it is "btrfs_start_transaction(root, 2);" that fails with ENOSPC, which makes the truncate operation fail.
| |
| |   Call btrfs_block_rsv_release() to fix this bug.
| |
| |   Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
| |   Reviewed-by: David Sterba <dsterba@suse.com>
| |   Signed-off-by: David Sterba <dsterba@suse.com>
* | Merge branch 'for-linus-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs (Linus Torvalds, 2017-01-13)
|\ \
| |/
| |   Pull btrfs fixes from Chris Mason:
| |    "These are all over the place.
| |
| |     The tracepoint part of the pull fixes a crash and adds a little more information to two tracepoints, while the rest are good old fashioned fixes"
| |
| |   * 'for-linus-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
| |     btrfs: make tracepoint format strings more compact
| |     Btrfs: add truncated_len for ordered extent tracepoints
| |     Btrfs: add 'inode' for extent map tracepoint
| |     btrfs: fix crash when tracepoint arguments are freed by wq callbacks
| |     Btrfs: adjust outstanding_extents counter properly when dio write is split
| |     Btrfs: fix lockdep warning about log_mutex
| |     Btrfs: use down_read_nested to make lockdep silent
| |     btrfs: fix locking when we put back a delayed ref that's too new
| |     btrfs: fix error handling when run_delayed_extent_op fails
| |     btrfs: return the actual error value from from btrfs_uuid_tree_iterate
| * Merge branch 'tracepoint-updates-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.10 (Chris Mason, 2017-01-11)
| |\
| | * Btrfs: add 'inode' for extent map tracepoint (Liu Bo, 2017-01-09)
| | |   'inode' is an important field for btrfs_get_extent, let's trace it.
| | |
| | |   Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
| | |   Reviewed-by: David Sterba <dsterba@suse.com>
| | |   Signed-off-by: David Sterba <dsterba@suse.com>
| * | Btrfs: adjust outstanding_extents counter properly when dio write is split (Liu Bo, 2017-01-03)
| |/
| |   Currently, btrfs dio does not handle split dio writes well enough: if a dio write is split into several segments due to the lack of contiguous space, a large dio write like 'dd bs=1G count=1' can end up with an incorrect outstanding_extents counter, and endio would complain loudly with an assertion.
| |
| |   This fixes the problem by compensating the outstanding_extents counter in the inode if a large dio write gets split.
| |
| |   Reported-by: Anand Jain <anand.jain@oracle.com>
| |   Tested-by: Anand Jain <anand.jain@oracle.com>
| |   Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
| |   Signed-off-by: David Sterba <dsterba@suse.com>
* | Merge uncontroversial parts of branch 'readlink' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs (Linus Torvalds, 2016-12-17)
|\ \
| | |   Pull partial readlink cleanups from Miklos Szeredi.
| | |
| | |   This is the uncontroversial part of the readlink cleanup patch-set that simplifies the default readlink handling.
| | |
| | |   Miklos and Al are still discussing the rest of the series.
| | |
| | |   * git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
| | |     vfs: make generic_readlink() static
| | |     vfs: remove ".readlink = generic_readlink" assignments
| | |     vfs: default to generic_readlink()
| | |     vfs: replace calling i_op->readlink with vfs_readlink()
| | |     proc/self: use generic_readlink
| | |     ecryptfs: use vfs_get_link()
| | |     bad_inode: add missing i_op initializers
| * | vfs: remove ".readlink = generic_readlink" assignments (Miklos Szeredi, 2016-12-09)
| | |   .readlink == NULL implies generic_readlink().
| | |
| | |   Generated by:
| | |
| | |     to_del="\.readlink.*=.*generic_readlink"
| | |     for i in `git grep -l $to_del`; do sed -i "/$to_del"/d $i; done
| | |
| | |   Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* | | Merge branch 'for-linus-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs (Linus Torvalds, 2016-12-16)
|\ \ \
| | |/
| |/|
| | |   Pull btrfs updates from Chris Mason:
| | |    "Jeff Mahoney and Dave Sterba have a really nice set of cleanups in here, and Christoph pitched in corrections/improvements to make btrfs use proper helpers for bio walking instead of doing it by hand.
| | |
| | |     There are some key fixes as well, including some long standing bugs that took forever to track down in btrfs_drop_extents and during balance"
| | |
| | |   * 'for-linus-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (77 commits)
| | |     btrfs: limit async_work allocation and worker func duration
| | |     Revert "Btrfs: adjust len of writes if following a preallocated extent"
| | |     Btrfs: don't WARN() in btrfs_transaction_abort() for IO errors
| | |     btrfs: opencode chunk locking, remove helpers
| | |     btrfs: remove root parameter from transaction commit/end routines
| | |     btrfs: split btrfs_wait_marked_extents into normal and tree log functions
| | |     btrfs: take an fs_info directly when the root is not used otherwise
| | |     btrfs: simplify btrfs_wait_cache_io prototype
| | |     btrfs: convert extent-tree tracepoints to use fs_info
| | |     btrfs: root->fs_info cleanup, access fs_info->delayed_root directly
| | |     btrfs: root->fs_info cleanup, add fs_info convenience variables
| | |     btrfs: root->fs_info cleanup, update_block_group{,flags}
| | |     btrfs: root->fs_info cleanup, lock/unlock_chunks
| | |     btrfs: root->fs_info cleanup, btrfs_calc_{trans,trunc}_metadata_size
| | |     btrfs: pull node/sector/stripe sizes out of root and into fs_info
| | |     btrfs: root->fs_info cleanup, io_ctl_init
| | |     btrfs: root->fs_info cleanup, use fs_info->dev_root everywhere
| | |     btrfs: struct reada_control.root -> reada_control.fs_info
| | |     btrfs: struct btrfsic_state->root should be an fs_info
| | |     btrfs: alloc_reserved_file_extent trace point should use extent_root
| | |     ...
| * | Revert "Btrfs: adjust len of writes if following a preallocated extent" (Chris Mason, 2016-12-11)
| | |   This is exposing an existing deadlock between fsync and AIO. Until we have the deadlock fixed, I'm pulling this one out.
| | |
| | |   This reverts commit a23eaa875f0f1d89eb866b8c9860e78273ff5daf.
| | |
| | |   Signed-off-by: Chris Mason <clm@fb.com>
| * | btrfs: remove root parameter from transaction commit/end routines (Jeff Mahoney, 2016-12-06)
| | |   Now we only use the root parameter to print the root objectid in a tracepoint. We can use the root parameter from the transaction handle for that. It's also used to join the transaction with async commits, so we remove the comment that it's just for checking.
| | |
| | |   Signed-off-by: Jeff Mahoney <jeffm@suse.com>
| | |   Signed-off-by: David Sterba <dsterba@suse.com>
| * | btrfs: take an fs_info directly when the root is not used otherwise (Jeff Mahoney, 2016-12-06)
| | |   There are loads of functions in btrfs that accept a root parameter but only use it to obtain an fs_info pointer. Let's convert those to just accept an fs_info pointer directly.
| | |
| | |   Signed-off-by: Jeff Mahoney <jeffm@suse.com>
| | |   Signed-off-by: David Sterba <dsterba@suse.com>
| * | btrfs: root->fs_info cleanup, add fs_info convenience variables (Jeff Mahoney, 2016-12-06)
| | |   In routines where someptr->fs_info is referenced multiple times, we introduce a convenience variable. This makes the code considerably more readable.
| | |
| | |   Signed-off-by: Jeff Mahoney <jeffm@suse.com>
| | |   Signed-off-by: David Sterba <dsterba@suse.com>
| * | btrfs: root->fs_info cleanup, btrfs_calc_{trans,trunc}_metadata_size (Jeff Mahoney, 2016-12-06)
| | |   Signed-off-by: Jeff Mahoney <jeffm@suse.com>
| | |   Signed-off-by: David Sterba <dsterba@suse.com>
| * | btrfs: pull node/sector/stripe sizes out of root and into fs_info (Jeff Mahoney, 2016-12-06)
| | |   We track the node sizes per-root, but they never vary from the values in the superblock. This patch messes with the 80-column style a bit, but subsequent patches to factor out root->fs_info into a convenience variable fix it up again.
| | |
| | |   Signed-off-by: Jeff Mahoney <jeffm@suse.com>
| | |   Signed-off-by: David Sterba <dsterba@suse.com>
| * | btrfs: call functions that always use the same root with fs_info instead (Jeff Mahoney, 2016-12-06)
| | |   There are many functions that are always called with the same root argument. Rather than passing the same root every time, we can pass an fs_info pointer instead and have the function get the root pointer itself.
| | |
| | |   Signed-off-by: Jeff Mahoney <jeffm@suse.com>
| | |   Signed-off-by: David Sterba <dsterba@suse.com>
| * | btrfs: don't access the bio directly in the direct I/O code (Christoph Hellwig, 2016-11-30)
| | |   Just use bio_for_each_segment_all to iterate over all segments.
| | |
| | |   Signed-off-by: Christoph Hellwig <hch@lst.de>
| | |   Reviewed-by: Omar Sandoval <osandov@fb.com>
| | |   Signed-off-by: David Sterba <dsterba@suse.com>
| * | btrfs: increment ctx->pos for every emitted or skipped dirent in readdir (Jeff Mahoney, 2016-11-30)
| | |   If we process the last item in the leaf and hit an I/O error while reading the next leaf, we return -EIO without having adjusted the position. Since we have emitted dirents, getdents() will return the byte count to the user instead of the error. Subsequent callers will emit the last successful dirent again, and return -EIO again, with the same result. Callers loop forever.
| | |
| | |   Instead, if we always increment ctx->pos after emitting or skipping the dirent, we'll be sure that we won't hit the same one again. When we go to process the next leaf, we won't have emitted any dirents and the -EIO will be returned to the user properly. We also don't need to track if we've emitted a dirent already or if we've changed the position yet.
| | |
| | |   Signed-off-by: Jeff Mahoney <jeffm@suse.com>
| | |   Reviewed-by: David Sterba <dsterba@suse.com>
| | |   Signed-off-by: David Sterba <dsterba@suse.com>
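The position-tracking rule from this commit can be modeled with a toy iterator. Everything here is hypothetical and written for illustration: `struct dir_ctx` stands in for the directory context, entries are just indices, and `bad_index` simulates the point where reading the next leaf fails.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the directory iteration context. */
struct dir_ctx {
	size_t pos;	/* index of the next entry to emit */
};

/*
 * Emit up to `batch` entries starting at ctx->pos. Reading the entry at
 * `bad_index` simulates an I/O error. Returns the number of entries
 * emitted, or -EIO if the error is hit before anything was emitted.
 */
static int read_entries(struct dir_ctx *ctx, size_t nr_entries,
			size_t bad_index, size_t batch)
{
	int emitted = 0;

	while (ctx->pos < nr_entries && (size_t)emitted < batch) {
		if (ctx->pos == bad_index)
			return emitted ? emitted : -EIO;
		emitted++;
		ctx->pos++;	/* always advance past what was just emitted */
	}
	return emitted;
}
```

Because the position moves forward with every emitted entry, the call after a partial batch starts exactly at the failing entry, emits nothing, and so reports the error instead of replaying the last good entry forever.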
| * | btrfs: remove old tree_root dirent processing in btrfs_real_readdir() (Jeff Mahoney, 2016-11-30)
| | |   Commit 3de4586c527 (Btrfs: Allow subvolumes and snapshots anywhere in the directory tree) introduced the current system of placing snapshots in the directory tree. It also introduced the behavior of creating the snapshot and then creating the directory entries for it. We've kept this code around for compatibility reasons, but it turns out that no file systems with the old tree_root based snapshots can be mounted on newer (>= 2009) kernels anyway. About a month after the above commit, commit 2a7108ad89e (Btrfs: rev the disk format for the inode compat and csum selection changes) landed, changing the superblock magic number.
| | |
| | |   As a result, we know that we'll never encounter tree_root-based dirents or have to deal with skipping our own snapshot dirents. Since that also means that we're now only iterating over DIR_INDEX items, which only contain one directory entry per leaf item, we don't need to loop over the leaf item contents anymore either.
| | |
| | |   Signed-off-by: Jeff Mahoney <jeffm@suse.com>
| | |   Reviewed-by: David Sterba <dsterba@suse.com>
| | |   Signed-off-by: David Sterba <dsterba@suse.com>