Re: Gotos
>>>>> "Phil" == Phil White <cerise@littlegreenmen.armory.com> writes:
Phil> This is flame bait. As a result, I promise only one reply to
Phil> it, should I receive criticism. That way, it needn't annoy
Phil> everyone else very much ; )
You have rather strange (and IMHO quite unpopular) criteria for
classifying something as flamebait.
Phil> Actually, if you try compiling that file with the patch and
Phil> without it, they do not turn out the same. I tried without -O,
Phil> with -O2, and with -O3.
Phil> In all cases (on my x86), it turns out in favor of the goto.
We've heard your opinion. Looking forward to the arguments ...
Phil> Your reply smacks of a rather distasteful approach towards
Phil> gotos: That they shouldn't ever be used. Gotos have a definite
Phil> use as Knuth pointed out in his response to Djikstra.
Please, quote the part where I said something against gotos.
Phil> That it doesn't matter is clearly false given this snippet of
Phil> code. It saves code space.
Phil> Does it make the code any less readable to have it exit at a
Phil> common point?
Phil> In my opinion, I've found multiple exits harder to track down
Phil> than gotos pointing at a common exit point in code.
Phil> The label in this case is well chosen. "out" is pretty
Phil> obviously leading out of the function. I don't think this
Phil> qualifies as spaghetti code seeing as how it jumps to a well
Phil> defined label and out.
Phil> As it is now, it is quantifiably better to use gotos here rather
Phil> than returns. I don't think readability is hurt at all by it.
Please, stop with the strawmen, ok ? I've never said anything about
the readability of the function.
As for "quantifiably better", how about defining it ? Do you mind
defining "better" as any of:
a) smaller code size
b) fewer cycles spent executing instructions
c) fewer cycles spent on memory loads/stores
And note the "any of": in different cases people may prefer to
strive for a different one (if they conflict). Also, I distinguish
between cycles spent for the reasons in b) and c) because their
relative significance differs across platforms.
So, let's take a small stand-alone example:
The goto variant (hereafter denoted x1.c):
------------------------------------------
int foo (), bar ();

int
baz ()
{
  int ret = 1;

  if (foo ())
    {
      ret = 0;
      goto out;
    }
  bar ();
 out:
  return ret;
}
The gotoless variant (hereafter denoted x2.c):
------------------------------------------
int foo (), bar ();

int
baz ()
{
  int ret = 1;

  if (foo ())
    {
      ret = 0;
      return ret;
    }
  bar ();
 out:
  return ret;
}
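
Both variants are meant to behave identically; only the spelling of the early exit differs. A minimal sketch of a harness that checks this, assuming stub definitions of foo() and bar() (neither is defined here) and the hypothetical names baz_goto, baz_return, variants_agree and check_variants:

```c
#include <assert.h>

/* Assumed stubs: foo() and bar() are only declared in x1.c and x2.c. */
static int foo_result;  /* value the stub foo() returns */
static int bar_calls;   /* number of times bar() has run */

int foo(void) { return foo_result; }
int bar(void) { bar_calls++; return 0; }

/* x1.c: early exit through a goto to a common label. */
int baz_goto(void)
{
    int ret = 1;

    if (foo()) {
        ret = 0;
        goto out;
    }
    bar();
out:
    return ret;
}

/* x2.c: early exit through a plain return. */
int baz_return(void)
{
    int ret = 1;

    if (foo()) {
        ret = 0;
        return ret;
    }
    bar();
    return ret;
}

/* Returns 1 if both variants agree on the result and on whether bar() ran. */
int variants_agree(int foo_value)
{
    int r1, r2, calls1, calls2;

    foo_result = foo_value;

    bar_calls = 0;
    r1 = baz_goto();
    calls1 = bar_calls;

    bar_calls = 0;
    r2 = baz_return();
    calls2 = bar_calls;

    return r1 == r2 && calls1 == calls2;
}

/* Usage: exercise both paths through baz(). */
void check_variants(void)
{
    assert(variants_agree(0));  /* foo() returns 0: bar() runs, result is 1 */
    assert(variants_agree(1));  /* foo() returns nonzero: early exit, result is 0 */
}
```

Since the observable behaviour is the same, any difference in the listings that follow comes from how the compiler treats the two spellings, not from what the function does.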
The compiler is:
$ gcc --version
gcc (GCC) 3.3.2
1. First, compile with ``-S -O3 -fomit-frame-pointer''
x1.s:
-----
        .file "x1.c"
        .text
        .p2align 4,,15
        .globl baz
        .type baz, @function
baz:
        subl $12, %esp
        movl %ebx, 8(%esp)
        movl $1, %ebx
        call foo
        testl %eax, %eax
        je .L2
        xorl %ebx, %ebx
.L3:
        movl %ebx, %eax
        movl 8(%esp), %ebx
        addl $12, %esp
        ret
        .p2align 4,,7
.L2:
        call bar
        jmp .L3
        .size baz, .-baz
        .ident "GCC: (GNU) 3.3.2"
x2.s:
-----
        .file "x2.c"
        .text
        .p2align 4,,15
        .globl baz
        .type baz, @function
baz:
        subl $12, %esp
        call foo
        xorl %edx, %edx
        testl %eax, %eax
        je .L4
.L1:
        movl %edx, %eax
        addl $12, %esp
        ret
        .p2align 4,,7
.L4:
.L3:
        call bar
        movl $1, %edx
        jmp .L1
        .size baz, .-baz
        .ident "GCC: (GNU) 3.3.2"
OK, the first thing we note is that there's a single epilogue sequence in
both variants. The second is that there are some extra
instructions to save/restore %ebx in the first variant, which cost
extra cycles on two counts: as extra instructions and as extra
memory accesses.
Summary: a) code size - x2.c (gotoless) is smaller by 2 insns
b) insn cycles - x2.c executes fewer insns
c) memory cycles - x2.c performs fewer loads/stores.
IOW, the gotoless variant is "better" according to the above criteria.
2. As we were speaking of code size, let's compile for code size:
``-S -Os -fomit-frame-pointer''.
x1.s:
-----
        .file "x1.c"
        .text
        .globl baz
        .type baz, @function
baz:
        pushl %ebx
        movl $1, %ebx
        call foo
        testl %eax, %eax
        je .L2
        xorl %ebx, %ebx
        jmp .L3
.L2:
        call bar
.L3:
        movl %ebx, %eax
        popl %ebx
        ret
        .size baz, .-baz
        .ident "GCC: (GNU) 3.3.2"
x2.s:
-----
        .file "x2.c"
        .text
        .globl baz
        .type baz, @function
baz:
        call foo
        xorl %edx, %edx
        testl %eax, %eax
        jne .L1
.L3:
        call bar
        movl $1, %edx
.L1:
        movl %edx, %eax
        ret
        .size baz, .-baz
        .ident "GCC: (GNU) 3.3.2"
Again, there's a single epilogue and the goto variant clobbers %ebx
(and thus needs to save/restore it).
Summary: a) code size - x2.s is smaller by 3 insns
b) insn cycles - x2.s executes fewer insns
c) memory cycles - x2.s performs fewer memory loads/stores
IOW, the gotoless variant is "better" according to the above criteria.
Let's see what happens on a different architecture.
$ arm-elf-gcc --version
arm-elf-gcc (GCC) 3.3.2
3. Compile with ``-S -O3 -fomit-frame-pointer''
x1.s:
-----
        .file "x1.c"
        .text
        .align 2
        .global baz
        .type baz, %function
baz:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 1, uses_anonymous_args = 0
        mov ip, sp
        stmfd sp!, {r4, fp, ip, lr, pc}
        sub fp, ip, #4
        bl foo
        cmp r0, #0
        mov r4, #1
        movne r4, #0
        bleq bar
.L3:
        mov r0, r4
        ldmea fp, {r4, fp, sp, pc}
        .size baz, .-baz
        .ident "GCC: (GNU) 3.3.2"
x2.s:
-----
        .file "x2.c"
        .text
        .align 2
        .global baz
        .type baz, %function
baz:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 1, uses_anonymous_args = 0
        mov ip, sp
        stmfd sp!, {fp, ip, lr, pc}
        sub fp, ip, #4
        bl foo
        cmp r0, #0
        mov r0, #0
        ldmneea fp, {fp, sp, pc}
.L3:
        bl bar
        mov r0, #1
        ldmea fp, {fp, sp, pc}
        .size baz, .-baz
        .ident "GCC: (GNU) 3.3.2"
The second variant contains two epilogues; the first variant again
gratuitously clobbers a register, with the resulting increase in
instruction cycles and memory accesses.
Summary: a) code size - both are 10 insns
b) insn cycles - x2.c is better because it executes 7 insns in
one case and 10 insns in the other, whereas x1.c
executes 10 insns in either case.
c) memory cycles - x1.c does an extra save/restore of r4
IOW, the gotoless variant is "better" according to the above criteria.
4. Compiling for code size - ``-S -Os -fomit-frame-pointer''
Produces identical output to the previous.
Now, what does it all prove ? Of course, it PROVES nothing (just
like your supposed compilations do), it DEMONSTRATES something.
It demonstrates that compilers are a lot smarter than one may think.
Putting ``goto'' in the source does not necessarily mean the compiler
will generate an unconditional jump instruction at that point. Jump
optimizations (e.g. jump-to-jump elimination) can eliminate gotos
altogether. Basic block reordering can totally change the "shape" of the
code as it appears in the C source. Conditional instructions
together with if-conversion can eliminate jumps too. Etc., etc. ...
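
As a rough illustration of if-conversion, the ARM listing for x1.c above behaves like the C below, in which the goto and its label have disappeared. This is a hand-written sketch of the transformed shape, not compiler output; foo() and bar() are assumed stubs, and baz_if_converted and check_if_converted are hypothetical names.

```c
#include <assert.h>

/* Assumed stubs: foo() and bar() are only declared in x1.c. */
static int foo_result;  /* value the stub foo() returns */
static int bar_calls;   /* number of times bar() has run */

int foo(void) { return foo_result; }
int bar(void) { bar_calls++; return 0; }

/*
 * Roughly what the ARM x1.s listing computes: both outcomes are
 * selected by the condition flags set after the call to foo().
 */
int baz_if_converted(void)
{
    int cond = foo();        /* bl foo ; cmp r0, #0 */
    int ret = cond ? 0 : 1;  /* mov r4, #1 ; movne r4, #0 */

    if (!cond)               /* bleq bar: conditional call, no separate jump */
        bar();
    return ret;              /* mov r0, r4 ; ldmea fp, {...} */
}

/* Usage: both paths give the same results as the goto version. */
void check_if_converted(void)
{
    foo_result = 1; bar_calls = 0;
    assert(baz_if_converted() == 0 && bar_calls == 0);

    foo_result = 0; bar_calls = 0;
    assert(baz_if_converted() == 1 && bar_calls == 1);
}
```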
In any case, DON'T LIE TO THE COMPILER! There's ONLY ONE
reason people generally write better assembler code than the compiler
-- because they know more about the program than the compiler does.
Thus, help the compiler by telling it more about the program. Avoid
casts. Use ``restrict''. Use whatever #pragmas there are. Use
__attribute__. Use __builtin_expect. If you want to do a return, do
a return. The compiler can infer a lot more from a ``return'' than
from a ``goto''.
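
A minimal sketch of what "telling the compiler more" can look like with GCC, assuming a C99-capable compiler for ``restrict''. The functions scale_add, count_zeros, digit_value and check_hints and the macro unlikely are hypothetical names made up for illustration; whether any of these hints changes the generated code depends on the compiler version and target.

```c
#include <assert.h>
#include <stddef.h>

/* GCC builtin: tell the compiler which way a condition usually goes. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* restrict (C99): promises that dst and src never overlap. */
void scale_add(float *restrict dst, const float *restrict src,
               float k, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        dst[i] += k * src[i];
}

/* GCC attribute: no side effects; result depends only on args and memory. */
__attribute__((pure))
int count_zeros(const unsigned char *buf, size_t n)
{
    size_t i;
    int zeros = 0;

    for (i = 0; i < n; i++)
        if (buf[i] == 0)
            zeros++;
    return zeros;
}

/* The error path is marked as the rare one. */
int digit_value(char c)
{
    if (unlikely(c < '0' || c > '9'))
        return -1;
    return c - '0';
}

/* Usage: the hints do not change results, only what the compiler may assume. */
void check_hints(void)
{
    float dst[3] = { 1.0f, 2.0f, 3.0f };
    const float src[3] = { 1.0f, 1.0f, 1.0f };
    const unsigned char buf[4] = { 0, 7, 0, 9 };

    scale_add(dst, src, 2.0f, 3);
    assert(dst[0] == 3.0f && dst[1] == 4.0f && dst[2] == 5.0f);
    assert(count_zeros(buf, 4) == 2);
    assert(digit_value('7') == 7);
    assert(digit_value('x') == -1);
}
```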
And note that this is not against gotos. They have valid uses; I
personally use them often (almost exclusively as a poor man's exception
handling).
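
For readers unfamiliar with the idiom, here is a minimal sketch of goto used as a poor man's exception handling: each failure jumps to a label that undoes the steps that already succeeded. The struct pair type and the functions pair_init and pair_free are hypothetical, made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct pair {
    char *a;
    char *b;
};

/* Copies both strings into p; returns 0 on success, -1 on failure. */
int pair_init(struct pair *p, const char *a, const char *b)
{
    assert(p && a && b);

    p->b = NULL;

    p->a = malloc(strlen(a) + 1);
    if (!p->a)
        goto fail;

    p->b = malloc(strlen(b) + 1);
    if (!p->b)
        goto fail_free_a;

    strcpy(p->a, a);
    strcpy(p->b, b);
    return 0;

    /* "catch" blocks: undo the steps that succeeded, in reverse order. */
fail_free_a:
    free(p->a);
    p->a = NULL;
fail:
    return -1;
}

/* Releases whatever pair_init() allocated. */
void pair_free(struct pair *p)
{
    free(p->b);
    free(p->a);
    p->a = p->b = NULL;
}
```

Here the gotos say something a plain return cannot: which cleanup steps still have to run before leaving the function.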
~velco
--
Kernelnewbies: Help each other learn about the Linux kernel.
Archive: http://mail.nl.linux.org/kernelnewbies/
FAQ: http://kernelnewbies.org/faq/