Skip to content

Commit a3d97d1

Browse files
committed
Merge tag 'ovl-fixes-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs
Pull overlayfs fixes from Amir Goldstein: - Fix regression in 'xino' feature detection I clumsily introduced this regression myself when working on another subsystem (fsnotify). Both the regression and the fix have almost no visible impact on users except for some kmsg prints. - Fix to performance regression in v6.12. This regression was reported by Google COS developers. It is not uncommon these days for the year-old mature LTS to get adopted by distros and get exposed to many new workloads. We made a sub-smart move of making a behavior change in v6.12 which could impact performance, without making it opt-in. Fixing this mistake retroactively, to be picked by LTS. * tag 'ovl-fixes-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs: ovl: make fsync after metadata copy-up opt-in mount option ovl: fix wrong detection of 32bit inode numbers
2 parents 241d4ca + 1f6ee9b commit a3d97d1

7 files changed

Lines changed: 108 additions & 16 deletions

File tree

Documentation/filesystems/overlayfs.rst

Lines changed: 50 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -783,6 +783,56 @@ controlled by the "uuid" mount option, which supports these values:
783783
mounted with "uuid=on".
784784

785785

786+
Durability and copy up
787+
----------------------
788+
789+
The fsync(2) system call ensures that the data and metadata of a file
790+
are safely written to the backing storage, which is expected to
791+
guarantee the existence of the information post system crash.
792+
793+
Without an fsync(2) call, there is no guarantee that the observed
794+
data after a system crash will be either the old or the new data, but
795+
in practice, the observed data after crash is often the old or new data
796+
or a mix of both.
797+
798+
When an overlayfs file is modified for the first time, copy up will
799+
create a copy of the lower file and its parent directories in the upper
800+
layer. Since the Linux filesystem API does not enforce any particular
801+
ordering on storing changes without explicit fsync(2) calls, in case
802+
of a system crash, the upper file could end up with no data at all
803+
(i.e. zeros), which would be an unusual outcome. To avoid this
804+
experience, overlayfs calls fsync(2) on the upper file before completing
805+
data copy up with rename(2) or link(2) to make the copy up "atomic".
806+
807+
By default, overlayfs does not explicitly call fsync(2) on copied up
808+
directories or on metadata-only copy up, so it provides no guarantee to
809+
persist the user's modification unless the user calls fsync(2).
810+
The fsync during copy up only guarantees that if a copy up is observed
811+
after a crash, the observed data is not zeroes or intermediate values
812+
from the copy up staging area.
813+
814+
On traditional local filesystems with a single journal (e.g. ext4, xfs),
815+
fsync on a file also persists the parent directory changes, because they
816+
are usually modified in the same transaction, so metadata durability during
817+
data copy up effectively comes for free. Overlayfs further limits risk by
818+
disallowing network filesystems as upper layer.
819+
820+
Overlayfs can be tuned to prefer performance or durability when storing
821+
to the underlying upper layer. This is controlled by the "fsync" mount
822+
option, which supports these values:
823+
824+
- "auto": (default)
825+
Call fsync(2) on upper file before completion of data copy up.
826+
No explicit fsync(2) on directory or metadata-only copy up.
827+
- "strict":
828+
Call fsync(2) on upper file and directories before completion of any
829+
copy up.
830+
- "volatile": [*]
831+
Prefer performance over durability (see `Volatile mount`_)
832+
833+
[*] The mount option "volatile" is an alias to "fsync=volatile".
834+
835+
786836
Volatile mount
787837
--------------
788838

fs/overlayfs/copy_up.c

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1146,15 +1146,15 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
11461146
return -EOVERFLOW;
11471147

11481148
/*
1149-
* With metacopy disabled, we fsync after final metadata copyup, for
1149+
* With "fsync=strict", we fsync after final metadata copyup, for
11501150
* both regular files and directories to get atomic copyup semantics
11511151
* on filesystems that do not use strict metadata ordering (e.g. ubifs).
11521152
*
1153-
* With metacopy enabled we want to avoid fsync on all meta copyup
1153+
* By default, we want to avoid fsync on all meta copyup, because
11541154
* that will hurt performance of workloads such as chown -R, so we
11551155
* only fsync on data copyup as legacy behavior.
11561156
*/
1157-
ctx.metadata_fsync = !OVL_FS(dentry->d_sb)->config.metacopy &&
1157+
ctx.metadata_fsync = ovl_should_sync_metadata(OVL_FS(dentry->d_sb)) &&
11581158
(S_ISREG(ctx.stat.mode) || S_ISDIR(ctx.stat.mode));
11591159
ctx.metacopy = ovl_need_meta_copy_up(dentry, ctx.stat.mode, flags);
11601160

fs/overlayfs/overlayfs.h

Lines changed: 21 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -99,6 +99,12 @@ enum {
9999
OVL_VERITY_REQUIRE,
100100
};
101101

102+
enum {
103+
OVL_FSYNC_VOLATILE,
104+
OVL_FSYNC_AUTO,
105+
OVL_FSYNC_STRICT,
106+
};
107+
102108
/*
103109
* The tuple (fh,uuid) is a universal unique identifier for a copy up origin,
104110
* where:
@@ -656,6 +662,21 @@ static inline bool ovl_xino_warn(struct ovl_fs *ofs)
656662
return ofs->config.xino == OVL_XINO_ON;
657663
}
658664

665+
static inline bool ovl_should_sync(struct ovl_fs *ofs)
666+
{
667+
return ofs->config.fsync_mode != OVL_FSYNC_VOLATILE;
668+
}
669+
670+
static inline bool ovl_should_sync_metadata(struct ovl_fs *ofs)
671+
{
672+
return ofs->config.fsync_mode == OVL_FSYNC_STRICT;
673+
}
674+
675+
static inline bool ovl_is_volatile(struct ovl_config *config)
676+
{
677+
return config->fsync_mode == OVL_FSYNC_VOLATILE;
678+
}
679+
659680
/*
660681
* To avoid regressions in existing setups with overlay lower offline changes,
661682
* we allow lower changes only if none of the new features are used.

fs/overlayfs/ovl_entry.h

Lines changed: 1 addition & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -18,7 +18,7 @@ struct ovl_config {
1818
int xino;
1919
bool metacopy;
2020
bool userxattr;
21-
bool ovl_volatile;
21+
int fsync_mode;
2222
};
2323

2424
struct ovl_sb {
@@ -120,11 +120,6 @@ static inline struct ovl_fs *OVL_FS(struct super_block *sb)
120120
return (struct ovl_fs *)sb->s_fs_info;
121121
}
122122

123-
static inline bool ovl_should_sync(struct ovl_fs *ofs)
124-
{
125-
return !ofs->config.ovl_volatile;
126-
}
127-
128123
static inline unsigned int ovl_numlower(struct ovl_entry *oe)
129124
{
130125
return oe ? oe->__numlower : 0;

fs/overlayfs/params.c

Lines changed: 28 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -58,6 +58,7 @@ enum ovl_opt {
5858
Opt_xino,
5959
Opt_metacopy,
6060
Opt_verity,
61+
Opt_fsync,
6162
Opt_volatile,
6263
Opt_override_creds,
6364
};
@@ -140,6 +141,23 @@ static int ovl_verity_mode_def(void)
140141
return OVL_VERITY_OFF;
141142
}
142143

144+
static const struct constant_table ovl_parameter_fsync[] = {
145+
{ "volatile", OVL_FSYNC_VOLATILE },
146+
{ "auto", OVL_FSYNC_AUTO },
147+
{ "strict", OVL_FSYNC_STRICT },
148+
{}
149+
};
150+
151+
static const char *ovl_fsync_mode(struct ovl_config *config)
152+
{
153+
return ovl_parameter_fsync[config->fsync_mode].name;
154+
}
155+
156+
static int ovl_fsync_mode_def(void)
157+
{
158+
return OVL_FSYNC_AUTO;
159+
}
160+
143161
const struct fs_parameter_spec ovl_parameter_spec[] = {
144162
fsparam_string_empty("lowerdir", Opt_lowerdir),
145163
fsparam_file_or_string("lowerdir+", Opt_lowerdir_add),
@@ -155,6 +173,7 @@ const struct fs_parameter_spec ovl_parameter_spec[] = {
155173
fsparam_enum("xino", Opt_xino, ovl_parameter_xino),
156174
fsparam_enum("metacopy", Opt_metacopy, ovl_parameter_bool),
157175
fsparam_enum("verity", Opt_verity, ovl_parameter_verity),
176+
fsparam_enum("fsync", Opt_fsync, ovl_parameter_fsync),
158177
fsparam_flag("volatile", Opt_volatile),
159178
fsparam_flag_no("override_creds", Opt_override_creds),
160179
{}
@@ -665,8 +684,11 @@ static int ovl_parse_param(struct fs_context *fc, struct fs_parameter *param)
665684
case Opt_verity:
666685
config->verity_mode = result.uint_32;
667686
break;
687+
case Opt_fsync:
688+
config->fsync_mode = result.uint_32;
689+
break;
668690
case Opt_volatile:
669-
config->ovl_volatile = true;
691+
config->fsync_mode = OVL_FSYNC_VOLATILE;
670692
break;
671693
case Opt_userxattr:
672694
config->userxattr = true;
@@ -800,6 +822,7 @@ int ovl_init_fs_context(struct fs_context *fc)
800822
ofs->config.nfs_export = ovl_nfs_export_def;
801823
ofs->config.xino = ovl_xino_def();
802824
ofs->config.metacopy = ovl_metacopy_def;
825+
ofs->config.fsync_mode = ovl_fsync_mode_def();
803826

804827
fc->s_fs_info = ofs;
805828
fc->fs_private = ctx;
@@ -870,9 +893,9 @@ int ovl_fs_params_verify(const struct ovl_fs_context *ctx,
870893
config->index = false;
871894
}
872895

873-
if (!config->upperdir && config->ovl_volatile) {
896+
if (!config->upperdir && ovl_is_volatile(config)) {
874897
pr_info("option \"volatile\" is meaningless in a non-upper mount, ignoring it.\n");
875-
config->ovl_volatile = false;
898+
config->fsync_mode = ovl_fsync_mode_def();
876899
}
877900

878901
if (!config->upperdir && config->uuid == OVL_UUID_ON) {
@@ -1070,8 +1093,8 @@ int ovl_show_options(struct seq_file *m, struct dentry *dentry)
10701093
seq_printf(m, ",xino=%s", ovl_xino_mode(&ofs->config));
10711094
if (ofs->config.metacopy != ovl_metacopy_def)
10721095
seq_printf(m, ",metacopy=%s", str_on_off(ofs->config.metacopy));
1073-
if (ofs->config.ovl_volatile)
1074-
seq_puts(m, ",volatile");
1096+
if (ofs->config.fsync_mode != ovl_fsync_mode_def())
1097+
seq_printf(m, ",fsync=%s", ovl_fsync_mode(&ofs->config));
10751098
if (ofs->config.userxattr)
10761099
seq_puts(m, ",userxattr");
10771100
if (ofs->config.verity_mode != ovl_verity_mode_def())

fs/overlayfs/super.c

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -776,7 +776,7 @@ static int ovl_make_workdir(struct super_block *sb, struct ovl_fs *ofs,
776776
* For volatile mount, create a incompat/volatile/dirty file to keep
777777
* track of it.
778778
*/
779-
if (ofs->config.ovl_volatile) {
779+
if (ovl_is_volatile(&ofs->config)) {
780780
err = ovl_create_volatile_dirty(ofs);
781781
if (err < 0) {
782782
pr_err("Failed to create volatile/dirty file.\n");

fs/overlayfs/util.c

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -85,7 +85,10 @@ int ovl_can_decode_fh(struct super_block *sb)
8585
if (!exportfs_can_decode_fh(sb->s_export_op))
8686
return 0;
8787

88-
return sb->s_export_op->encode_fh ? -1 : FILEID_INO32_GEN;
88+
if (sb->s_export_op->encode_fh == generic_encode_ino32_fh)
89+
return FILEID_INO32_GEN;
90+
91+
return -1;
8992
}
9093

9194
struct dentry *ovl_indexdir(struct super_block *sb)

0 commit comments

Comments
 (0)