[go: nahoru, domu]

Bug 103100 - [11 Regression] unaligned access generated with memset or {} and -O2 -mstrict-align
Summary: [11 Regression] unaligned access generated with memset or {} and -O2 -mstrict...
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: 11.2.0
: P2 normal
Target Milestone: 12.5
Assignee: Wilco
URL:
Keywords: wrong-code
: 105886 109273 (view as bug list)
Depends on:
Blocks: 105886
  Show dependency treegraph
 
Reported: 2021-11-05 12:52 UTC by felix
Modified: 2024-07-05 13:07 UTC (History)
6 users (show)

See Also:
Host:
Target: aarch64
Build:
Known to work: 14.0
Known to fail: 13.3.1
Last reconfirmed: 2021-11-05 00:00:00


Attachments
source code that generates the faulty assembly (97 bytes, text/plain)
2021-11-05 12:52 UTC, felix
Details
New simplier patch which does the right thing too (911 bytes, patch)
2021-11-18 08:50 UTC, Andrew Pinski
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description felix 2021-11-05 12:52:19 UTC
Created attachment 51738 [details]
source code that generates the faulty assembly

when zero-intializing large local variables, gcc 11.2 (with -O2 and -O3) uses SIMD registers to store a pair of 16-byte registers at once into memory. When doing so, gcc can generate code that does not access memory on a 16-byte aligned boundary, even though the aarch64 architecture requires memory accesses to be 16-byte aligned when using the full 16-byte SIMD registers. This happens with -mstrict-align enabled.

For example:

static void (*use)(unsigned char*); // to suppress optimizations

extern "C" void _start() {
    unsigned char t2[216]={};
    use(t2);
}

when compiled with "gcc -save-temps -O2 -mstrict-align" generates the following assembly:
_start:
	stp	x29, x30, [sp, #-240]!// assuming sp is aligned to 16-bytes here 
	mov	x1, #0x0
	movi	v0.4s, #0x0
	add	x2, sp, #0x28 // the value in x2 is 8-byte aligned, but not 
                                 16-byte aligned
	mov	x29, sp
	stp	xzr, xzr, [sp, #24]
	add	x0, sp, #0x18
	stp	q0, q0, [x2] // x2 is not 16-byte aligned, so the store is not
                                aligned
	add	x2, sp, #0x48
	str	xzr, [sp, #232]
	stp	q0, q0, [x2]
	add	x2, sp, #0x68
	stp	q0, q0, [x2]
	add	x2, sp, #0x88
	stp	q0, q0, [x2]
	add	x2, sp, #0xa8
	stp	q0, q0, [x2]
	add	x2, sp, #0xc8
	stp	q0, q0, [x2]
	blr	x1
	ldp	x29, x30, [sp], #240
	ret

I have seen https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71727 and even though that is marked as fixed, this issue persists in gcc 11.2
Comment 1 Andrew Pinski 2021-11-05 12:58:08 UTC
Dup of bug 101934, fixed for gcc 11.3.0.

*** This bug has been marked as a duplicate of bug 101934 ***
Comment 2 Andrew Pinski 2021-11-05 13:17:59 UTC
Actually it is not a dup.
Comment 3 Andrew Pinski 2021-11-05 13:21:34 UTC
Confirmed, another simplier testcase:
void use(unsigned char*);

void g() {
    unsigned char t2[216];
    __builtin_memset(t2, 0, sizeof(t2));
    use(t2);
}
Comment 4 Andrew Pinski 2021-11-05 13:26:46 UTC
Mine.
This should fix it (untested):
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 699c105a42a..5d0872be4ef 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -23584,7 +23584,9 @@ aarch64_expand_setmem (rtx *operands)
         over writing.  */
       opt_scalar_int_mode mode_iter;
       FOR_EACH_MODE_IN_CLASS (mode_iter, MODE_INT)
-       if (GET_MODE_BITSIZE (mode_iter.require ()) <= MIN (n, copy_limit))
+       if (GET_MODE_BITSIZE (mode_iter.require ()) <= MIN (n, copy_limit)
+           && (!STRICT_ALIGNMENT
+               || MEM_ALIGN (dst) >= GET_MODE_ALIGNMENT (mode_iter.require ())))
          cur_mode = mode_iter.require ();

       gcc_assert (cur_mode != BLKmode);
Comment 5 Andrew Pinski 2021-11-05 13:27:24 UTC
Also it was caused by r11-4973.
Comment 6 Andrew Pinski 2021-11-05 13:38:19 UTC
(In reply to Andrew Pinski from comment #4)
> Mine.
> This should fix it (untested):
And yes it works.
Comment 7 Andrew Pinski 2021-11-06 05:26:36 UTC
Patch submitted:
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/583554.html
Comment 8 Andrew Pinski 2021-11-18 08:50:12 UTC
Created attachment 51830 [details]
New simplier patch which does the right thing too
Comment 9 Andrew Pinski 2021-11-19 01:55:11 UTC
New patch submitted:
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584927.html
Comment 10 Andrew Pinski 2022-01-26 02:30:17 UTC
Updated patch submitted:
https://gcc.gnu.org/pipermail/gcc-patches/2022-January/589254.html
Comment 11 Richard Biener 2022-04-21 07:50:44 UTC
GCC 11.3 is being released, retargeting bugs to GCC 11.4.
Comment 12 Andrew Pinski 2022-06-09 10:28:59 UTC
*** Bug 105886 has been marked as a duplicate of this bug. ***
Comment 13 Richard Biener 2023-01-19 13:28:47 UTC
(In reply to Andrew Pinski from comment #10)
> Updated patch submitted:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-January/589254.html

I think you need to ping your patches more aggressively ...
Comment 14 Richard Earnshaw 2023-01-20 13:59:58 UTC
(In reply to Richard Biener from comment #13)
> (In reply to Andrew Pinski from comment #10)
> > Updated patch submitted:
> > https://gcc.gnu.org/pipermail/gcc-patches/2022-January/589254.html
> 
> I think you need to ping your patches more aggressively ...

Richard Sandiford reviewed it here:| https://gcc.gnu.org/pipermail/gcc-patches/2022-February/589581.html
So the problem is that the review wasn't followed up by the submitter.
Comment 15 felix 2023-01-23 07:29:12 UTC
(In reply to Richard Earnshaw from comment #14)
> (In reply to Richard Biener from comment #13)
> > (In reply to Andrew Pinski from comment #10)
> > > Updated patch submitted:
> > > https://gcc.gnu.org/pipermail/gcc-patches/2022-January/589254.html
> > 
> > I think you need to ping your patches more aggressively ...
> 
> Richard Sandiford reviewed it here:|
> https://gcc.gnu.org/pipermail/gcc-patches/2022-February/589581.html
> So the problem is that the review wasn't followed up by the submitter.

I did not know that I have any further obligation on this past submitting the bug, I never submitted a bug before. Anyway, the since the patch works (at least for my use case), do I have to mark this as resolved, fixed?
Comment 16 Sam James 2023-01-23 07:37:04 UTC
(In reply to felix from comment #15)

He means apinski who submitted a patch, not you.
Comment 17 Andrew Pinski 2023-01-31 17:49:10 UTC
Another testcase which is now affecting us at Marvell in our early firmware:
```
void f(const char*);

void g(void)
{
  char t[32] = "0l2345678";
  f(t);
}

```
So I am planning on getting back on this patch starting today.
Comment 18 Andrew Pinski 2023-01-31 17:51:23 UTC
(In reply to Andrew Pinski from comment #17)
> Another testcase which is now affecting us at Marvell in our early firmware:
> ```
> void f(const char*);
> 
> void g(void)
> {
>   char t[32] = "0l2345678";
>   f(t);
> }
> 
> ```
> So I am planning on getting back on this patch starting today.

I should say that testcase happens at `-Os -mstrict-align`, at `-O2 -mstrict-align` it works.
Comment 19 Richard Earnshaw 2023-01-31 18:24:42 UTC
(In reply to Andrew Pinski from comment #18)
> I should say that testcase happens at `-Os -mstrict-align`, at `-O2
> -mstrict-align` it works.

Because for -Os we don't forcibly align arrays - see 
AARCH64_EXPAND_ALIGNMENT and the macros that use it.
Comment 20 Andrew Pinski 2023-02-10 03:54:10 UTC
Submitted a version new patch:
https://gcc.gnu.org/pipermail/gcc-patches/2023-February/611684.html
Comment 21 Andrew Pinski 2023-03-24 12:09:04 UTC
*** Bug 109273 has been marked as a duplicate of this bug. ***
Comment 22 Andrew Pinski 2023-04-05 19:25:37 UTC
I am no longer working on this.
Comment 23 Jakub Jelinek 2023-05-29 10:06:02 UTC
GCC 11.4 is being released, retargeting bugs to GCC 11.5.
Comment 24 Wilco 2023-09-20 13:53:35 UTC
Patch to avoid emitting unaligned LDP/STP with -mstrict-align: https://gcc.gnu.org/pipermail/gcc-patches/2023-September/631022.html
Comment 25 GCC Commits 2023-11-30 13:49:32 UTC
The master branch has been updated by Wilco Dijkstra <wilco@gcc.gnu.org>:

https://gcc.gnu.org/g:318f5232cfb3e0c9694889565e1f5424d0354463

commit r14-6012-g318f5232cfb3e0c9694889565e1f5424d0354463
Author: Wilco Dijkstra <wilco.dijkstra@arm.com>
Date:   Wed Oct 25 16:28:04 2023 +0100

    AArch64: Fix strict-align cpymem/setmem [PR103100]
    
    The cpymemdi/setmemdi implementation doesn't fully support strict alignment.
    Block the expansion if the alignment is less than 16 with STRICT_ALIGNMENT.
    Clean up the condition when to use MOPS.
    
    gcc/ChangeLog/
            PR target/103100
            * config/aarch64/aarch64.md (cpymemdi): Remove pattern condition.
            (setmemdi): Likewise.
            * config/aarch64/aarch64.cc (aarch64_expand_cpymem): Support
            strict-align.  Cleanup condition for using MOPS.
            (aarch64_expand_setmem): Likewise.
Comment 26 Hans-Peter Nilsson 2024-06-26 23:46:02 UTC
(In reply to GCC Commits from comment #25)
> The master branch has been updated by Wilco Dijkstra <wilco@gcc.gnu.org>:

Are you considering back-porting this to other branches (like gcc-13) or is there another reason this PR is still open?  The commit applies cleanly there, but is untested.   Asking for a friend. :)
Comment 27 GCC Commits 2024-06-27 17:23:42 UTC
The releases/gcc-13 branch has been updated by Wilco Dijkstra <wilco@gcc.gnu.org>:

https://gcc.gnu.org/g:5aa9ed0f353f835005c3df8932c7bc6e26f53904

commit r13-8874-g5aa9ed0f353f835005c3df8932c7bc6e26f53904
Author: Wilco Dijkstra <wilco.dijkstra@arm.com>
Date:   Wed Oct 25 16:28:04 2023 +0100

    AArch64: Fix strict-align cpymem/setmem [PR103100]
    
    The cpymemdi/setmemdi implementation doesn't fully support strict alignment.
    Block the expansion if the alignment is less than 16 with STRICT_ALIGNMENT.
    Clean up the condition when to use MOPS.
    
    gcc/ChangeLog/
            PR target/103100
            * config/aarch64/aarch64.md (cpymemdi): Remove pattern condition.
            (setmemdi): Likewise.
            * config/aarch64/aarch64.cc (aarch64_expand_cpymem): Support
            strict-align.  Cleanup condition for using MOPS.
            (aarch64_expand_setmem): Likewise.
    
    (cherry picked from commit 318f5232cfb3e0c9694889565e1f5424d0354463)
Comment 28 Hans-Peter Nilsson 2024-06-27 23:13:33 UTC
(In reply to GCC Commits from comment #27)
> The releases/gcc-13 branch has been updated by Wilco Dijkstra
> <wilco@gcc.gnu.org>:

Thanks!
Comment 29 GCC Commits 2024-07-05 12:40:02 UTC
The releases/gcc-12 branch has been updated by Wilco Dijkstra <wilco@gcc.gnu.org>:

https://gcc.gnu.org/g:b9d16d8361a9e3a82a2f21e759e760d235d43322

commit r12-10603-gb9d16d8361a9e3a82a2f21e759e760d235d43322
Author: Wilco Dijkstra <wilco.dijkstra@arm.com>
Date:   Wed Oct 25 16:28:04 2023 +0100

    AArch64: Fix strict-align cpymem/setmem [PR103100]
    
    The cpymemdi/setmemdi implementation doesn't fully support strict alignment.
    Block the expansion if the alignment is less than 16 with STRICT_ALIGNMENT.
    Clean up the condition when to use MOPS.
    
    gcc/ChangeLog/
            PR target/103100
            * config/aarch64/aarch64.md (cpymemdi): Remove pattern condition.
            (setmemdi): Likewise.
            * config/aarch64/aarch64.cc (aarch64_expand_cpymem): Support
            strict-align.  Cleanup condition for using MOPS.
            (aarch64_expand_setmem): Likewise.
    
    (cherry picked from commit 318f5232cfb3e0c9694889565e1f5424d0354463)
Comment 30 Wilco 2024-07-05 13:02:55 UTC
Fixed on GCC12 branch too. It doesn't apply to GCC11, so it's unlikely to be worth fixing since GCC11 branch will be closed soon.