Is Swap Without Temporary a Good Idea?

January 31st, 2007 by Lorenzo

The classic way of swapping to variables uses a temporary variable. In C++ it looks something like:

void swap1(int &a, int &b)
{
     int c=a;
     a = b;
     b = c;
}

Some developers can’t stand that they need a temporary variable to make the swap, and prefer to write:

void swap2(int &a, int &b)
{
     a ^= b;
     b ^= a;
     a ^= b
}

Assuming that a and b are pointing to different memory locations, this actually works.
It is based on the fact that [ (a^b)^a = b ] and [ (b^a)^b = a ]

While this seems like a cool trick, is it really worth writing it? Or should you stick with the classics?
I say it is not worth it! Stick with the classics!!!

In fact:

It is slower

swap1 requires 3 assignments.
swap2 requires 3 XORs and 3 assignments.

If you are not convinced, here is the assembly generated by VS for swap1 (you get similar results with gcc and other compilers):

6:    void swap1(int &a, int &b)
7:    {
00401030   push        ebp
00401031   mov         ebp,esp
00401033   sub         esp,44h
00401036   push        ebx
00401037   push        esi
00401038   push        edi
00401039   lea         edi,[ebp-44h]
0040103C   mov         ecx,11h
00401041   mov         eax,0CCCCCCCCh
00401046   rep stos    dword ptr [edi]
8:        int c;
9:        c = a;
00401048   mov         eax,dword ptr [ebp+8]
0040104B   mov         ecx,dword ptr [eax]
0040104D   mov         dword ptr [ebp-4],ecx
10:       a = b;
00401050   mov         edx,dword ptr [ebp+8]
00401053   mov         eax,dword ptr [ebp+0Ch]
00401056   mov         ecx,dword ptr [eax]
00401058   mov         dword ptr [edx],ecx
11:       b = c;
0040105A   mov         edx,dword ptr [ebp+0Ch]
0040105D   mov         eax,dword ptr [ebp-4]
00401060   mov         dword ptr [edx],eax
12:   }

While for swap2 the assembly generated by VS is:

14:   void swap2(int &a, int &b)
15:   {
00401080   push        ebp
00401081   mov         ebp,esp
00401083   sub         esp,40h
00401086   push        ebx
00401087   push        esi
00401088   push        edi
00401089   lea         edi,[ebp-40h]
0040108C   mov         ecx,10h
00401091   mov         eax,0CCCCCCCCh
00401096   rep stos    dword ptr [edi]
16:        a ^= b;
00401098   mov         eax,dword ptr [ebp+8]
0040109B   mov         ecx,dword ptr [ebp+0Ch]
0040109E   mov         edx,dword ptr [eax]
004010A0   xor         edx,dword ptr [ecx]
004010A2   mov         eax,dword ptr [ebp+8]
004010A5   mov         dword ptr [eax],edx
17:        b ^= a;
004010A7   mov         ecx,dword ptr [ebp+0Ch]
004010AA   mov         edx,dword ptr [ebp+8]
004010AD   mov         eax,dword ptr [ecx]
004010AF   xor         eax,dword ptr [edx]
004010B1   mov         ecx,dword ptr [ebp+0Ch]
004010B4   mov         dword ptr [ecx],eax
18:        a ^= b;
004010B6   mov         edx,dword ptr [ebp+8]
004010B9   mov         eax,dword ptr [ebp+0Ch]
004010BC   mov         ecx,dword ptr [edx]
004010BE   xor         ecx,dword ptr [eax]
004010C0   mov         edx,dword ptr [ebp+8]
004010C3   mov         dword ptr [edx],ecx
19:   }

This should make it clear how swap2 is less efficient than swap1.

Provides no real advantage

The only real advantage is the rush you get from having written a piece of unreadable code using cool tricks.

The fact that you save sizeof(int) bytes on the stack in 99.99% of the cases is not a real good reason to write swap2 instead of swap1. Perhaps it is advantageous in some recursive situation, where every byte in the stack counts and where you don’t want to make an extra function call to a swap function. I’d have to see such code to believe it.

It is not readable

I know. If you wrote it, you know what it does. But what about the rest of the world?
The problem is that for many others swap2 is not going to be readable and it is going to generate only confusion, not amazement.

In the long term it is far better to write readable code than some cool unreadable and inefficient piece of work.

Keep it simple!!

3 Responses to “Is Swap Without Temporary a Good Idea?”

  1. Alex Says:

    Consider instead the more general case, not using memory storage in a properly optimized piece of code:
    where eax contains A, ebx contains B
    //Assuming ecx isn’t in use
    mov ecx,eax
    mov ebx,eax
    mov eax,ecx

    Now consider this using the “xor” method
    xor eax,ebx
    xor ebx,eax
    xor eax,ebx

    Speedwise these to small pieces of code perform quite similar, readability isn’t damaged, the advantage is saving a register. Ofcourse register interdependencies in the pipeline might vary depending on processor, but it’s hard not to justify using the xor method, due to the fact that it frees up a register to be used for something else.

  2. Adrian Boeing Says:

    Not that I advocate the XOR swap trick, but you might find it gets closer to what its meant to do if you switch to release mode:

    swap1:

    ; 16 : int c=a;

    mov eax, DWORD PTR [ecx]
    push esi

    ; 17 : a = b;

    mov esi, DWORD PTR [edx]
    mov DWORD PTR [ecx], esi

    ; 18 : b = c;

    mov DWORD PTR [edx], eax
    pop esi

    ; 19 : }

    swap2:
    ; 23 : a ^= b;

    mov eax, DWORD PTR [edx]
    xor DWORD PTR [ecx], eax
    mov eax, DWORD PTR [ecx]

    ; 24 : b ^= a;

    xor DWORD PTR [edx], eax
    mov edx, DWORD PTR [edx]

    ; 25 : a ^= b;

    xor DWORD PTR [ecx], edx

    I’m sure if you flicked a few more optimization options you would end up with some sensible assembler code.

  3. Lorenzo Says:

    Adrian: Looks pretty close, but still slower, and the source is still odd to look at.

    Alex: I was mostly interested in how a C/C++ compiler can optimize such code, and the advantage to a C/C++ programmer. In the case you suggest, if you were to write the code by hand in assembly, I agree it is not a bad choice. However, in my opinion, it is still unreadable in the sense that if you don’t know the “trick”, it is going to take a few minutes to find out how that works.

Leave a Reply

You must be logged in to post a comment.