[kernel] r22249 - in dists/sid/linux/debian: . patches patches/bugfix/all patches/bugfix/x86

Ben Hutchings benh at moszumanska.debian.org
Mon Jan 12 03:09:51 UTC 2015


Author: benh
Date: Mon Jan 12 03:09:51 2015
New Revision: 22249

Log:
Add various security fixes

Added:
   dists/sid/linux/debian/patches/bugfix/all/batman-adv-calculate-extra-tail-size-based-on-queued.patch
   dists/sid/linux/debian/patches/bugfix/all/isofs-fix-infinite-looping-over-ce-entries.patch
   dists/sid/linux/debian/patches/bugfix/all/isofs-fix-unchecked-printing-of-er-records.patch
   dists/sid/linux/debian/patches/bugfix/all/keys-close-race-between-key-lookup-and-freeing.patch
   dists/sid/linux/debian/patches/bugfix/x86/x86-kvm-clear-paravirt_enabled-on-kvm-guests-for-esp.patch
   dists/sid/linux/debian/patches/bugfix/x86/x86-tls-validate-tls-entries-to-protect-espfix.patch
   dists/sid/linux/debian/patches/bugfix/x86/x86_64-switch_to-load-tls-descriptors-before-switchi.patch
Modified:
   dists/sid/linux/debian/changelog
   dists/sid/linux/debian/patches/series

Modified: dists/sid/linux/debian/changelog
==============================================================================
--- dists/sid/linux/debian/changelog	Mon Jan 12 02:52:47 2015	(r22248)
+++ dists/sid/linux/debian/changelog	Mon Jan 12 03:09:51 2015	(r22249)
@@ -104,6 +104,16 @@
   * [x86] ACPI / video: Run _BCL before deciding registering backlight
     (regression in 3.16) (Closes: #762285)
   * [amd64] Enable EFI_MIXED to support Bay Trail systems
+  * [x86] tls: Validate TLS entries to protect espfix (CVE-2014-8133)
+  * [x86] kvm: Clear paravirt_enabled on KVM guests for espfix32's benefit
+    (CVE-2014-8134)
+  * [amd64] switch_to(): Load TLS descriptors before switching DS and ES
+    (CVE-2014-9419)
+  * isofs: Fix infinite looping over CE entries (CVE-2014-9420)
+  * batman-adv: Calculate extra tail size based on queued fragments
+    (Closes: #774155) (CVE-2014-9428)
+  * KEYS: close race between key lookup and freeing (CVE-2014-9529)
+  * isofs: Fix unchecked printing of ER records (CVE-2014-9584)
 
   [ Ian Campbell ]
   * [armhf] Enable support for support OMAP5432 uEVM by enabling:

Added: dists/sid/linux/debian/patches/bugfix/all/batman-adv-calculate-extra-tail-size-based-on-queued.patch
==============================================================================
--- /dev/null	00:00:00 1970	(empty, because file is newly added)
+++ dists/sid/linux/debian/patches/bugfix/all/batman-adv-calculate-extra-tail-size-based-on-queued.patch	Mon Jan 12 03:09:51 2015	(r22249)
@@ -0,0 +1,58 @@
+From: Sven Eckelmann <sven at narfation.org>
+Date: Sat, 20 Dec 2014 13:48:55 +0100
+Subject: batman-adv: Calculate extra tail size based on queued fragments
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+Origin: https://git.kernel.org/linus/5b6698b0e4a37053de35cc24ee695b98a7eb712b
+
+The fragmentation code was replaced in 610bfc6bc99bc83680d190ebc69359a05fc7f605
+("batman-adv: Receive fragmented packets and merge"). The new code provided a
+mostly unused parameter skb for the merging function. It is used inside the
+function to calculate the additional skb tailroom needed. But instead of
+increasing its own tailroom, it only increases the tailroom of the first
+queued skb. This is not correct in some situations because the first queued
+entry can be different from the skb passed as the parameter.
+
+An observed problem was:
+
+1. packet with size 104, total_size 1464, fragno 1 was received
+   - packet is queued
+2. packet with size 1400, total_size 1464, fragno 0 was received
+   - packet is queued at the end of the list
+3. enough data was received and can be given to the merge function
+   (1464 == (1400 - 20) + (104 - 20))
+   - the merge function gets the 1400 byte packet as its skb argument
+4. the merge function takes the first entry in the queue (104 bytes)
+   - stored as skb_out
+5. the merge function calculates the required extra tail as total_size - skb->len
+   - pskb_expand_head() grows the tail of skb_out by only 64 bytes
+6. the merge function tries to squeeze the extra 1380 bytes from the second queued
+   skb (1400 bytes, i.e. the skb parameter) into the 64 extra tail bytes of skb_out
+
+Instead, calculate the extra tail bytes required for skb_out from skb_out itself
+rather than from the parameter skb. The skb parameter is only used to get the
+total_size from the last received packet. This is also the total_size used to
+decide that all fragments were received.
+
+Reported-by: Philipp Psurek <philipp.psurek at gmail.com>
+Signed-off-by: Sven Eckelmann <sven at narfation.org>
+Acked-by: Martin Hundebøll <martin at hundeboll.net>
+Signed-off-by: David S. Miller <davem at davemloft.net>
+---
+ net/batman-adv/fragmentation.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+diff --git a/net/batman-adv/fragmentation.c b/net/batman-adv/fragmentation.c
+index fc1835c..8af3461 100644
+--- a/net/batman-adv/fragmentation.c
++++ b/net/batman-adv/fragmentation.c
+@@ -251,7 +251,7 @@ batadv_frag_merge_packets(struct hlist_head *chain, struct sk_buff *skb)
+ 	kfree(entry);
+ 
+ 	/* Make room for the rest of the fragments. */
+-	if (pskb_expand_head(skb_out, 0, size - skb->len, GFP_ATOMIC) < 0) {
++	if (pskb_expand_head(skb_out, 0, size - skb_out->len, GFP_ATOMIC) < 0) {
+ 		kfree_skb(skb_out);
+ 		skb_out = NULL;
+ 		goto free;

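To make the arithmetic in the observed problem above concrete, here is a
minimal user-space sketch of the two tailroom calculations (hypothetical
variable names, not part of the patch):

#include <stdio.h>

int main(void)
{
	/* Values from the observed problem described in the commit message. */
	unsigned int total_size  = 1464;	/* size of the fully merged packet */
	unsigned int skb_out_len = 104;		/* first queued fragment, i.e. skb_out */
	unsigned int skb_len     = 1400;	/* last received fragment, i.e. the skb parameter */

	/* Buggy calculation: based on the parameter skb, leaving only 64 tail bytes. */
	printf("buggy tailroom: %u\n", total_size - skb_len);

	/* Fixed calculation: based on skb_out itself, making room for 1360 bytes. */
	printf("fixed tailroom: %u\n", total_size - skb_out_len);

	return 0;
}
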
Added: dists/sid/linux/debian/patches/bugfix/all/isofs-fix-infinite-looping-over-ce-entries.patch
==============================================================================
--- /dev/null	00:00:00 1970	(empty, because file is newly added)
+++ dists/sid/linux/debian/patches/bugfix/all/isofs-fix-infinite-looping-over-ce-entries.patch	Mon Jan 12 03:09:51 2015	(r22249)
@@ -0,0 +1,52 @@
+From: Jan Kara <jack at suse.cz>
+Date: Mon, 15 Dec 2014 14:22:46 +0100
+Subject: isofs: Fix infinite looping over CE entries
+Origin: https://git.kernel.org/linus/f54e18f1b831c92f6512d2eedb224cd63d607d3d
+
+Rock Ridge extensions define so-called Continuation Entries (CE) which
+specify where further space with Rock Ridge data is located. A corrupted
+isofs image can contain an arbitrarily long chain of these, including one
+that forms a loop, thus causing the kernel to end up in an infinite loop
+when traversing these entries.
+
+Limit the traversal to 32 entries which should be more than enough space
+to store all the Rock Ridge data.
+
+Reported-by: P J P <ppandit at redhat.com>
+CC: stable at vger.kernel.org
+Signed-off-by: Jan Kara <jack at suse.cz>
+---
+ fs/isofs/rock.c | 6 ++++++
+ 1 file changed, 6 insertions(+)
+
+diff --git a/fs/isofs/rock.c b/fs/isofs/rock.c
+index f488bba..bb63254 100644
+--- a/fs/isofs/rock.c
++++ b/fs/isofs/rock.c
+@@ -30,6 +30,7 @@ struct rock_state {
+ 	int cont_size;
+ 	int cont_extent;
+ 	int cont_offset;
++	int cont_loops;
+ 	struct inode *inode;
+ };
+ 
+@@ -73,6 +74,9 @@ static void init_rock_state(struct rock_state *rs, struct inode *inode)
+ 	rs->inode = inode;
+ }
+ 
++/* Maximum number of Rock Ridge continuation entries */
++#define RR_MAX_CE_ENTRIES 32
++
+ /*
+  * Returns 0 if the caller should continue scanning, 1 if the scan must end
+  * and -ve on error.
+@@ -105,6 +109,8 @@ static int rock_continue(struct rock_state *rs)
+ 			goto out;
+ 		}
+ 		ret = -EIO;
++		if (++rs->cont_loops >= RR_MAX_CE_ENTRIES)
++			goto out;
+ 		bh = sb_bread(rs->inode->i_sb, rs->cont_extent);
+ 		if (bh) {
+ 			memcpy(rs->buffer, bh->b_data + rs->cont_offset,

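Outside of kernel context, the bounded traversal added above amounts to the
following minimal sketch (hypothetical types; the real code reads the
continuation extents with sb_bread()):

#include <stdio.h>

#define RR_MAX_CE_ENTRIES 32	/* same bound as the patch */

struct ce_entry {
	struct ce_entry *next;	/* next continuation entry, possibly looping */
};

/* Returns 0 if the whole chain was walked, -1 if the bound was hit. */
static int walk_ce_chain(struct ce_entry *ce)
{
	int loops = 0;

	while (ce) {
		if (++loops >= RR_MAX_CE_ENTRIES)
			return -1;	/* corrupted image: give up instead of spinning */
		ce = ce->next;
	}
	return 0;
}

int main(void)
{
	struct ce_entry a, b;

	a.next = &b;
	b.next = &a;	/* deliberately corrupted: the two entries point at each other */

	printf("looping chain -> %d\n", walk_ce_chain(&a));
	return 0;
}
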
Added: dists/sid/linux/debian/patches/bugfix/all/isofs-fix-unchecked-printing-of-er-records.patch
==============================================================================
--- /dev/null	00:00:00 1970	(empty, because file is newly added)
+++ dists/sid/linux/debian/patches/bugfix/all/isofs-fix-unchecked-printing-of-er-records.patch	Mon Jan 12 03:09:51 2015	(r22249)
@@ -0,0 +1,30 @@
+From: Jan Kara <jack at suse.cz>
+Date: Thu, 18 Dec 2014 17:26:10 +0100
+Subject: isofs: Fix unchecked printing of ER records
+Origin: https://git.kernel.org/linus/4e2024624e678f0ebb916e6192bd23c1f9fdf696
+
+We didn't check the length of Rock Ridge ER records before printing them.
+Thus a corrupted isofs image can cause us to access and print memory
+beyond the buffer, with obvious consequences.
+
+Reported-and-tested-by: Carl Henrik Lunde <chlunde at ping.uio.no>
+CC: stable at vger.kernel.org
+Signed-off-by: Jan Kara <jack at suse.cz>
+---
+ fs/isofs/rock.c | 3 +++
+ 1 file changed, 3 insertions(+)
+
+diff --git a/fs/isofs/rock.c b/fs/isofs/rock.c
+index bb63254..735d752 100644
+--- a/fs/isofs/rock.c
++++ b/fs/isofs/rock.c
+@@ -362,6 +362,9 @@ repeat:
+ 			rs.cont_size = isonum_733(rr->u.CE.size);
+ 			break;
+ 		case SIG('E', 'R'):
++			/* Invalid length of ER tag id? */
++			if (rr->u.ER.len_id + offsetof(struct rock_ridge, u.ER.data) > rr->len)
++				goto out;
+ 			ISOFS_SB(inode->i_sb)->s_rock = 1;
+ 			printk(KERN_DEBUG "ISO 9660 Extensions: ");
+ 			{

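The same bounds check can be shown as a stand-alone sketch with a simplified
record layout (hypothetical struct; the real struct rock_ridge carries more
fields):

#include <stdio.h>
#include <stddef.h>
#include <string.h>

/* Simplified variable-length ER record: len covers the whole record. */
struct er_record {
	unsigned char len;	/* total length of this record */
	unsigned char len_id;	/* claimed length of the id[] payload */
	char id[];		/* len_id bytes, not NUL-terminated */
};

static void print_er_id(const struct er_record *er)
{
	/* Invalid length of ER tag id?  Same check as the patch. */
	if (er->len_id + offsetof(struct er_record, id) > er->len) {
		printf("corrupt ER record, not printing\n");
		return;
	}
	printf("%.*s\n", er->len_id, er->id);
}

int main(void)
{
	unsigned char buf[16] = { 0 };
	struct er_record *er = (struct er_record *)buf;

	er->len = 6;
	er->len_id = 200;	/* lies about the payload size */
	memcpy(er->id, "RRIP", 4);

	print_er_id(er);	/* rejected instead of reading past the buffer */
	return 0;
}
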
Added: dists/sid/linux/debian/patches/bugfix/all/keys-close-race-between-key-lookup-and-freeing.patch
==============================================================================
--- /dev/null	00:00:00 1970	(empty, because file is newly added)
+++ dists/sid/linux/debian/patches/bugfix/all/keys-close-race-between-key-lookup-and-freeing.patch	Mon Jan 12 03:09:51 2015	(r22249)
@@ -0,0 +1,42 @@
+From: Sasha Levin <sasha.levin at oracle.com>
+Date: Mon, 29 Dec 2014 09:39:01 -0500
+Subject: KEYS: close race between key lookup and freeing
+Origin: https://git.kernel.org/linus/a3a8784454692dd72e5d5d34dcdab17b4420e74c
+
+When a key is being garbage collected, its key->user gets put before
+the ->destroy() callback is called, which is where the key is removed from its
+respective tracking structures.
+
+This leaves the key hanging in a semi-invalid state, which leaves a window open
+for a different task to try to access key->user. An example is
+find_keyring_by_name() which would dereference key->user for a key that is
+in the process of being garbage collected (where key->user was freed but
+->destroy() wasn't called yet - so it's still present in the linked list).
+
+This would either cause a panic or corrupt memory.
+
+Fixes CVE-2014-9529.
+
+Signed-off-by: Sasha Levin <sasha.levin at oracle.com>
+Signed-off-by: David Howells <dhowells at redhat.com>
+---
+ security/keys/gc.c | 4 ++--
+ 1 file changed, 2 insertions(+), 2 deletions(-)
+
+--- a/security/keys/gc.c
++++ b/security/keys/gc.c
+@@ -157,12 +157,12 @@ static noinline void key_gc_unused_keys(
+ 		if (test_bit(KEY_FLAG_INSTANTIATED, &key->flags))
+ 			atomic_dec(&key->user->nikeys);
+ 
+-		key_user_put(key->user);
+-
+ 		/* now throw away the key memory */
+ 		if (key->type->destroy)
+ 			key->type->destroy(key);
+ 
++		key_user_put(key->user);
++
+ 		kfree(key->description);
+ 
+ #ifdef KEY_DEBUGGING

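The ordering question can be illustrated outside the kernel with a minimal
sketch (hypothetical types, no locking): the reference on the owning user must
only be dropped once ->destroy() has unlinked the key from the structures that
lookups walk.

#include <stdio.h>
#include <stdlib.h>

/* Minimal stand-ins for the kernel structures involved. */
struct key_user {
	int refcount;
};

struct key {
	struct key_user *user;
	int on_name_list;	/* stands in for the tracking structures */
};

static void key_user_put(struct key_user *user)
{
	if (--user->refcount == 0)
		free(user);	/* a lookup dereferencing key->user after this is a use-after-free */
}

static void key_destroy(struct key *key)
{
	key->on_name_list = 0;	/* remove the key from the structures lookups walk */
}

static void key_gc_unused_key(struct key *key)
{
	/*
	 * Order from the patch: unlink the key first, so a lookup such as
	 * find_keyring_by_name() can no longer reach it, then drop key->user.
	 * Doing key_user_put() first opens the race described above.
	 */
	key_destroy(key);
	key_user_put(key->user);
	free(key);
}

int main(void)
{
	struct key_user *user = calloc(1, sizeof(*user));
	struct key *key = calloc(1, sizeof(*key));

	user->refcount = 1;
	key->user = user;
	key->on_name_list = 1;

	key_gc_unused_key(key);
	printf("teardown done in the safe order\n");
	return 0;
}
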
Added: dists/sid/linux/debian/patches/bugfix/x86/x86-kvm-clear-paravirt_enabled-on-kvm-guests-for-esp.patch
==============================================================================
--- /dev/null	00:00:00 1970	(empty, because file is newly added)
+++ dists/sid/linux/debian/patches/bugfix/x86/x86-kvm-clear-paravirt_enabled-on-kvm-guests-for-esp.patch	Mon Jan 12 03:09:51 2015	(r22249)
@@ -0,0 +1,63 @@
+From: Andy Lutomirski <luto at amacapital.net>
+Date: Fri, 5 Dec 2014 19:03:28 -0800
+Subject: x86, kvm: Clear paravirt_enabled on KVM guests for espfix32's benefit
+Origin: https://git.kernel.org/linus/29fa6825463c97e5157284db80107d1bfac5d77b
+
+paravirt_enabled has the following effects:
+
+ - Disables the F00F bug workaround warning.  There is no F00F bug
+   workaround any more because Linux's standard IDT handling already
+   works around the F00F bug, but the warning still exists.  This
+   is only cosmetic, and, in any event, there is no such thing as
+   KVM on a CPU with the F00F bug.
+
+ - Disables 32-bit APM BIOS detection.  On a KVM paravirt system,
+   there should be no APM BIOS anyway.
+
+ - Disables tboot.  I think that the tboot code should check the
+   CPUID hypervisor bit directly if it matters.
+
+ - paravirt_enabled disables espfix32.  espfix32 should *not* be
+   disabled under KVM paravirt.
+
+The last point is the purpose of this patch.  It fixes a leak of the
+high 16 bits of the kernel stack address on 32-bit KVM paravirt
+guests.  Fixes CVE-2014-8134.
+
+Cc: stable at vger.kernel.org
+Suggested-by: Konrad Rzeszutek Wilk <konrad.wilk at oracle.com>
+Signed-off-by: Andy Lutomirski <luto at amacapital.net>
+Signed-off-by: Paolo Bonzini <pbonzini at redhat.com>
+---
+ arch/x86/kernel/kvm.c      | 9 ++++++++-
+ arch/x86/kernel/kvmclock.c | 1 -
+ 2 files changed, 8 insertions(+), 2 deletions(-)
+
+--- a/arch/x86/kernel/kvm.c
++++ b/arch/x86/kernel/kvm.c
+@@ -282,7 +282,14 @@ NOKPROBE_SYMBOL(do_async_page_fault);
+ static void __init paravirt_ops_setup(void)
+ {
+ 	pv_info.name = "KVM";
+-	pv_info.paravirt_enabled = 1;
++
++	/*
++	 * KVM isn't paravirt in the sense of paravirt_enabled.  A KVM
++	 * guest kernel works like a bare metal kernel with additional
++	 * features, and paravirt_enabled is about features that are
++	 * missing.
++	 */
++	pv_info.paravirt_enabled = 0;
+ 
+ 	if (kvm_para_has_feature(KVM_FEATURE_NOP_IO_DELAY))
+ 		pv_cpu_ops.io_delay = kvm_io_delay;
+--- a/arch/x86/kernel/kvmclock.c
++++ b/arch/x86/kernel/kvmclock.c
+@@ -263,7 +263,6 @@ void __init kvmclock_init(void)
+ #endif
+ 	kvm_get_preset_lpj();
+ 	clocksource_register_hz(&kvm_clock, NSEC_PER_SEC);
+-	pv_info.paravirt_enabled = 1;
+ 	pv_info.name = "KVM";
+ 
+ 	if (kvm_para_has_feature(KVM_FEATURE_CLOCKSOURCE_STABLE_BIT))

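As an aside on the tboot point above, the CPUID hypervisor bit can be read
directly; a minimal user-space sketch using GCC's <cpuid.h> (illustrative only,
not part of the patch):

/* gcc -o hvbit hvbit.c  (x86 only) */
#include <stdio.h>
#include <cpuid.h>

int main(void)
{
	unsigned int eax, ebx, ecx, edx;

	if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
		return 1;

	/* CPUID.1:ECX bit 31 is the "hypervisor present" bit. */
	printf("running under a hypervisor: %s\n",
	       (ecx & (1u << 31)) ? "yes" : "no");
	return 0;
}
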
Added: dists/sid/linux/debian/patches/bugfix/x86/x86-tls-validate-tls-entries-to-protect-espfix.patch
==============================================================================
--- /dev/null	00:00:00 1970	(empty, because file is newly added)
+++ dists/sid/linux/debian/patches/bugfix/x86/x86-tls-validate-tls-entries-to-protect-espfix.patch	Mon Jan 12 03:09:51 2015	(r22249)
@@ -0,0 +1,75 @@
+From: Andy Lutomirski <luto at amacapital.net>
+Date: Thu, 4 Dec 2014 16:48:16 -0800
+Subject: x86/tls: Validate TLS entries to protect espfix
+Origin: https://git.kernel.org/linus/41bdc78544b8a93a9c6814b8bbbfef966272abbe
+
+Installing a 16-bit RW data segment into the GDT defeats espfix.
+AFAICT this will not affect glibc, Wine, or dosemu at all.
+
+Signed-off-by: Andy Lutomirski <luto at amacapital.net>
+Acked-by: H. Peter Anvin <hpa at zytor.com>
+Cc: stable at vger.kernel.org
+Cc: Konrad Rzeszutek Wilk <konrad.wilk at oracle.com>
+Cc: Linus Torvalds <torvalds at linux-foundation.org>
+Cc: security at kernel.org <security at kernel.org>
+Cc: Willy Tarreau <w at 1wt.eu>
+Signed-off-by: Ingo Molnar <mingo at kernel.org>
+---
+ arch/x86/kernel/tls.c | 23 +++++++++++++++++++++++
+ 1 file changed, 23 insertions(+)
+
+diff --git a/arch/x86/kernel/tls.c b/arch/x86/kernel/tls.c
+index f7fec09..e7650bd 100644
+--- a/arch/x86/kernel/tls.c
++++ b/arch/x86/kernel/tls.c
+@@ -27,6 +27,21 @@ static int get_free_idx(void)
+ 	return -ESRCH;
+ }
+ 
++static bool tls_desc_okay(const struct user_desc *info)
++{
++	if (LDT_empty(info))
++		return true;
++
++	/*
++	 * espfix is required for 16-bit data segments, but espfix
++	 * only works for LDT segments.
++	 */
++	if (!info->seg_32bit)
++		return false;
++
++	return true;
++}
++
+ static void set_tls_desc(struct task_struct *p, int idx,
+ 			 const struct user_desc *info, int n)
+ {
+@@ -66,6 +81,9 @@ int do_set_thread_area(struct task_struct *p, int idx,
+ 	if (copy_from_user(&info, u_info, sizeof(info)))
+ 		return -EFAULT;
+ 
++	if (!tls_desc_okay(&info))
++		return -EINVAL;
++
+ 	if (idx == -1)
+ 		idx = info.entry_number;
+ 
+@@ -192,6 +210,7 @@ int regset_tls_set(struct task_struct *target, const struct user_regset *regset,
+ {
+ 	struct user_desc infobuf[GDT_ENTRY_TLS_ENTRIES];
+ 	const struct user_desc *info;
++	int i;
+ 
+ 	if (pos >= GDT_ENTRY_TLS_ENTRIES * sizeof(struct user_desc) ||
+ 	    (pos % sizeof(struct user_desc)) != 0 ||
+@@ -205,6 +224,10 @@ int regset_tls_set(struct task_struct *target, const struct user_regset *regset,
+ 	else
+ 		info = infobuf;
+ 
++	for (i = 0; i < count / sizeof(struct user_desc); i++)
++		if (!tls_desc_okay(info + i))
++			return -EINVAL;
++
+ 	set_tls_desc(target,
+ 		     GDT_ENTRY_TLS_MIN + (pos / sizeof(struct user_desc)),
+ 		     info, count / sizeof(struct user_desc));

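For reference, a user-space sketch of the descriptor shape that tls_desc_okay()
now rejects, modelled on the es test quoted in the next patch (hypothetical
program, not part of the patch):

/* gcc -o tls16 tls16.c  (compare a patched and an unpatched kernel) */
#include <stdio.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <asm/ldt.h>

int main(void)
{
	struct user_desc desc = {
		.entry_number    = -1,		/* let the kernel pick a slot */
		.base_addr       = 0x1000,
		.limit           = 0xffff,
		.seg_32bit       = 0,		/* 16-bit data segment: what the patch rejects */
		.contents        = 0,
		.read_exec_only  = 0,
		.limit_in_pages  = 0,
		.seg_not_present = 0,
		.useable         = 0,
	};

	if (syscall(SYS_set_thread_area, &desc) != 0)
		printf("set_thread_area rejected the 16-bit segment: %s\n",
		       strerror(errno));	/* EINVAL with the patch applied */
	else
		printf("set_thread_area accepted the 16-bit segment (unpatched kernel)\n");
	return 0;
}
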
Added: dists/sid/linux/debian/patches/bugfix/x86/x86_64-switch_to-load-tls-descriptors-before-switchi.patch
==============================================================================
--- /dev/null	00:00:00 1970	(empty, because file is newly added)
+++ dists/sid/linux/debian/patches/bugfix/x86/x86_64-switch_to-load-tls-descriptors-before-switchi.patch	Mon Jan 12 03:09:51 2015	(r22249)
@@ -0,0 +1,304 @@
+From: Andy Lutomirski <luto at amacapital.net>
+Date: Mon, 8 Dec 2014 13:55:20 -0800
+Subject: x86_64, switch_to(): Load TLS descriptors before switching DS and ES
+Origin: https://git.kernel.org/linus/f647d7c155f069c1a068030255c300663516420e
+
+Otherwise, if buggy user code points DS or ES into the TLS
+array, they would be corrupted after a context switch.
+
+This also significantly improves the comments and documents some
+gotchas in the code.
+
+Before this patch, both tests below failed.  With this
+patch, the es test passes, although the gsbase test still fails.
+
+ ----- begin es test -----
+
+/*
+ * Copyright (c) 2014 Andy Lutomirski
+ * GPL v2
+ */
+
+static unsigned short GDT3(int idx)
+{
+	return (idx << 3) | 3;
+}
+
+static int create_tls(int idx, unsigned int base)
+{
+	struct user_desc desc = {
+		.entry_number    = idx,
+		.base_addr       = base,
+		.limit           = 0xfffff,
+		.seg_32bit       = 1,
+		.contents        = 0, /* Data, grow-up */
+		.read_exec_only  = 0,
+		.limit_in_pages  = 1,
+		.seg_not_present = 0,
+		.useable         = 0,
+	};
+
+	if (syscall(SYS_set_thread_area, &desc) != 0)
+		err(1, "set_thread_area");
+
+	return desc.entry_number;
+}
+
+int main()
+{
+	int idx = create_tls(-1, 0);
+	printf("Allocated GDT index %d\n", idx);
+
+	unsigned short orig_es;
+	asm volatile ("mov %%es,%0" : "=rm" (orig_es));
+
+	int errors = 0;
+	int total = 1000;
+	for (int i = 0; i < total; i++) {
+		asm volatile ("mov %0,%%es" : : "rm" (GDT3(idx)));
+		usleep(100);
+
+		unsigned short es;
+		asm volatile ("mov %%es,%0" : "=rm" (es));
+		asm volatile ("mov %0,%%es" : : "rm" (orig_es));
+		if (es != GDT3(idx)) {
+			if (errors == 0)
+				printf("[FAIL]\tES changed from 0x%hx to 0x%hx\n",
+				       GDT3(idx), es);
+			errors++;
+		}
+	}
+
+	if (errors) {
+		printf("[FAIL]\tES was corrupted %d/%d times\n", errors, total);
+		return 1;
+	} else {
+		printf("[OK]\tES was preserved\n");
+		return 0;
+	}
+}
+
+ ----- end es test -----
+
+ ----- begin gsbase test -----
+
+/*
+ * gsbase.c, a gsbase test
+ * Copyright (c) 2014 Andy Lutomirski
+ * GPL v2
+ */
+
+static unsigned char *testptr, *testptr2;
+
+static unsigned char read_gs_testvals(void)
+{
+	unsigned char ret;
+	asm volatile ("movb %%gs:%1, %0" : "=r" (ret) : "m" (*testptr));
+	return ret;
+}
+
+int main()
+{
+	int errors = 0;
+
+	testptr = mmap((void *)0x200000000UL, 1, PROT_READ | PROT_WRITE,
+		       MAP_PRIVATE | MAP_FIXED | MAP_ANONYMOUS, -1, 0);
+	if (testptr == MAP_FAILED)
+		err(1, "mmap");
+
+	testptr2 = mmap((void *)0x300000000UL, 1, PROT_READ | PROT_WRITE,
+		       MAP_PRIVATE | MAP_FIXED | MAP_ANONYMOUS, -1, 0);
+	if (testptr2 == MAP_FAILED)
+		err(1, "mmap");
+
+	*testptr = 0;
+	*testptr2 = 1;
+
+	if (syscall(SYS_arch_prctl, ARCH_SET_GS,
+		    (unsigned long)testptr2 - (unsigned long)testptr) != 0)
+		err(1, "ARCH_SET_GS");
+
+	usleep(100);
+
+	if (read_gs_testvals() == 1) {
+		printf("[OK]\tARCH_SET_GS worked\n");
+	} else {
+		printf("[FAIL]\tARCH_SET_GS failed\n");
+		errors++;
+	}
+
+	asm volatile ("mov %0,%%gs" : : "r" (0));
+
+	if (read_gs_testvals() == 0) {
+		printf("[OK]\tWriting 0 to gs worked\n");
+	} else {
+		printf("[FAIL]\tWriting 0 to gs failed\n");
+		errors++;
+	}
+
+	usleep(100);
+
+	if (read_gs_testvals() == 0) {
+		printf("[OK]\tgsbase is still zero\n");
+	} else {
+		printf("[FAIL]\tgsbase was corrupted\n");
+		errors++;
+	}
+
+	return errors == 0 ? 0 : 1;
+}
+
+ ----- end gsbase test -----
+
+Signed-off-by: Andy Lutomirski <luto at amacapital.net>
+Cc: <stable at vger.kernel.org>
+Cc: Andi Kleen <andi at firstfloor.org>
+Cc: Linus Torvalds <torvalds at linux-foundation.org>
+Link: http://lkml.kernel.org/r/509d27c9fec78217691c3dad91cec87e1006b34a.1418075657.git.luto@amacapital.net
+Signed-off-by: Ingo Molnar <mingo at kernel.org>
+---
+ arch/x86/kernel/process_64.c | 101 +++++++++++++++++++++++++++++++------------
+ 1 file changed, 73 insertions(+), 28 deletions(-)
+
+--- a/arch/x86/kernel/process_64.c
++++ b/arch/x86/kernel/process_64.c
+@@ -286,24 +286,9 @@ __switch_to(struct task_struct *prev_p,
+ 
+ 	fpu = switch_fpu_prepare(prev_p, next_p, cpu);
+ 
+-	/*
+-	 * Reload esp0, LDT and the page table pointer:
+-	 */
++	/* Reload esp0 and ss1. */
+ 	load_sp0(tss, next);
+ 
+-	/*
+-	 * Switch DS and ES.
+-	 * This won't pick up thread selector changes, but I guess that is ok.
+-	 */
+-	savesegment(es, prev->es);
+-	if (unlikely(next->es | prev->es))
+-		loadsegment(es, next->es);
+-
+-	savesegment(ds, prev->ds);
+-	if (unlikely(next->ds | prev->ds))
+-		loadsegment(ds, next->ds);
+-
+-
+ 	/* We must save %fs and %gs before load_TLS() because
+ 	 * %fs and %gs may be cleared by load_TLS().
+ 	 *
+@@ -312,41 +297,101 @@ __switch_to(struct task_struct *prev_p,
+ 	savesegment(fs, fsindex);
+ 	savesegment(gs, gsindex);
+ 
++	/*
++	 * Load TLS before restoring any segments so that segment loads
++	 * reference the correct GDT entries.
++	 */
+ 	load_TLS(next, cpu);
+ 
+ 	/*
+-	 * Leave lazy mode, flushing any hypercalls made here.
+-	 * This must be done before restoring TLS segments so
+-	 * the GDT and LDT are properly updated, and must be
+-	 * done before math_state_restore, so the TS bit is up
+-	 * to date.
++	 * Leave lazy mode, flushing any hypercalls made here.  This
++	 * must be done after loading TLS entries in the GDT but before
++	 * loading segments that might reference them, and it must
++	 * be done before math_state_restore, so the TS bit is up to
++	 * date.
+ 	 */
+ 	arch_end_context_switch(next_p);
+ 
++	/* Switch DS and ES.
++	 *
++	 * Reading them only returns the selectors, but writing them (if
++	 * nonzero) loads the full descriptor from the GDT or LDT.  The
++	 * LDT for next is loaded in switch_mm, and the GDT is loaded
++	 * above.
++	 *
++	 * We therefore need to write new values to the segment
++	 * registers on every context switch unless both the new and old
++	 * values are zero.
++	 *
++	 * Note that we don't need to do anything for CS and SS, as
++	 * those are saved and restored as part of pt_regs.
++	 */
++	savesegment(es, prev->es);
++	if (unlikely(next->es | prev->es))
++		loadsegment(es, next->es);
++
++	savesegment(ds, prev->ds);
++	if (unlikely(next->ds | prev->ds))
++		loadsegment(ds, next->ds);
++
+ 	/*
+ 	 * Switch FS and GS.
+ 	 *
+-	 * Segment register != 0 always requires a reload.  Also
+-	 * reload when it has changed.  When prev process used 64bit
+-	 * base always reload to avoid an information leak.
++	 * These are even more complicated than DS and ES: they have
++	 * 64-bit bases that are controlled by arch_prctl.  Those bases
++	 * only differ from the values in the GDT or LDT if the selector
++	 * is 0.
++	 *
++	 * Loading the segment register resets the hidden base part of
++	 * the register to 0 or the value from the GDT / LDT.  If the
++	 * next base address is zero, writing 0 to the segment register is
++	 * much faster than using wrmsr to explicitly zero the base.
++	 *
++	 * The thread_struct.fs and thread_struct.gs values are 0
++	 * if the fs and gs bases respectively are not overridden
++	 * from the values implied by fsindex and gsindex.  They
++	 * are nonzero, and store the nonzero base addresses, if
++	 * the bases are overridden.
++	 *
++	 * (fs != 0 && fsindex != 0) || (gs != 0 && gsindex != 0) should
++	 * be impossible.
++	 *
++	 * Therefore we need to reload the segment registers if either
++	 * the old or new selector is nonzero, and we need to override
++	 * the base address if next thread expects it to be overridden.
++	 *
++	 * This code is unnecessarily slow in the case where the old and
++	 * new indexes are zero and the new base is nonzero -- it will
++	 * unnecessarily write 0 to the selector before writing the new
++	 * base address.
++	 *
++	 * Note: This all depends on arch_prctl being the only way that
++	 * user code can override the segment base.  Once wrfsbase and
++	 * wrgsbase are enabled, most of this code will need to change.
+ 	 */
+ 	if (unlikely(fsindex | next->fsindex | prev->fs)) {
+ 		loadsegment(fs, next->fsindex);
++
+ 		/*
+-		 * Check if the user used a selector != 0; if yes
+-		 *  clear 64bit base, since overloaded base is always
+-		 *  mapped to the Null selector
++		 * If user code wrote a nonzero value to FS, then it also
++		 * cleared the overridden base address.
++		 *
++		 * XXX: if user code wrote 0 to FS and cleared the base
++		 * address itself, we won't notice and we'll incorrectly
++		 * restore the prior base address next time we reschedule
++		 * the process.
+ 		 */
+ 		if (fsindex)
+ 			prev->fs = 0;
+ 	}
+-	/* when next process has a 64bit base use it */
+ 	if (next->fs)
+ 		wrmsrl(MSR_FS_BASE, next->fs);
+ 	prev->fsindex = fsindex;
+ 
+ 	if (unlikely(gsindex | next->gsindex | prev->gs)) {
+ 		load_gs_index(next->gsindex);
++
++		/* This works (and fails) the same way as fsindex above. */
+ 		if (gsindex)
+ 			prev->gs = 0;
+ 	}

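The es and gsbase tests above are quoted exactly as in the upstream commit
message, without their includes; building them stand-alone would need roughly
the following headers (an assumption about the intended build, not part of the
patch):

/* es test */
#include <stdio.h>
#include <err.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <asm/ldt.h>		/* struct user_desc */

/* gsbase test additionally needs */
#include <sys/mman.h>
#include <asm/prctl.h>		/* ARCH_SET_GS */

/* build: gcc -std=gnu99 -o es es.c ; gcc -o gsbase gsbase.c */
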
Modified: dists/sid/linux/debian/patches/series
==============================================================================
--- dists/sid/linux/debian/patches/series	Mon Jan 12 02:52:47 2015	(r22248)
+++ dists/sid/linux/debian/patches/series	Mon Jan 12 03:09:51 2015	(r22249)
@@ -484,3 +484,10 @@
 debian/pci-fix-abi-change-in-3.16.7-ckt3.patch
 features/x86/platform-chrome-chromeos_laptop-add-support-for-acer.patch
 bugfix/x86/acpi-video-run-_bcl-before-deciding-registering-back.patch
+bugfix/x86/x86-tls-validate-tls-entries-to-protect-espfix.patch
+bugfix/x86/x86-kvm-clear-paravirt_enabled-on-kvm-guests-for-esp.patch
+bugfix/x86/x86_64-switch_to-load-tls-descriptors-before-switchi.patch
+bugfix/all/isofs-fix-infinite-looping-over-ce-entries.patch
+bugfix/all/batman-adv-calculate-extra-tail-size-based-on-queued.patch
+bugfix/all/keys-close-race-between-key-lookup-and-freeing.patch
+bugfix/all/isofs-fix-unchecked-printing-of-er-records.patch
