Strict aliasing rule and 'char *' pointers

Accepted answer
Score: 17

if we have two pointers, one of type char * and another of type struct something * pointing to the same location, how is it possible that the first aliases the second but the second doesn't alias the first?

It does, but that's not the point.

The point is that if you have one or more struct somethings then you may use a char* to read their constituent bytes, but if you have one or more chars then you may not use a struct something* to read them.
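A minimal sketch of both directions, assuming a hypothetical struct something with two members (the member names and both helper functions are invented for illustration). Reading the struct's bytes through a character pointer is permitted; for the reverse direction the sketch copies the bytes into a real struct object with memcpy instead of casting the buffer pointer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical struct, used only for illustration. */
struct something {
    int   id;
    float value;
};

/* Permitted: inspect the bytes of any object through a character pointer.
   Returns the number of bytes that differ between the two objects. */
size_t count_differing_bytes(const struct something *a,
                             const struct something *b)
{
    const unsigned char *pa = (const unsigned char *)a;
    const unsigned char *pb = (const unsigned char *)b;
    size_t diff = 0;

    for (size_t i = 0; i < sizeof *a; ++i)
        if (pa[i] != pb[i])
            ++diff;
    return diff;
}

/* Reverse direction: a plain byte buffer is not a struct something, so
   rather than casting buf to struct something *, copy its bytes into an
   actual struct something object. */
struct something load_something(const unsigned char *buf)
{
    struct something s;
    memcpy(&s, buf, sizeof s);
    return s;
}
```

The buffer passed to load_something is assumed to hold at least sizeof(struct something) bytes that were originally produced from a valid struct something.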

Score: 12

The wording in the referenced answer is slightly erroneous, so let's get that ironed out first: One object never aliases another object, but two pointers can "alias" the same object (meaning, the pointers point to the same memory location - as M.M. pointed out, this is still not 100% correct wording, but you get the idea). Also, the standard itself doesn't (to the best of my knowledge) actually talk about strict aliasing at all, but only gives rules about which kinds of expressions may be used to access an object. Compiler flags like '-fno-strict-aliasing' tell the compiler whether it can assume the programmer followed those rules (so it can perform optimizations based on that assumption) or not.

Now to your question: Any object can be accessed through a pointer to char, but a char object (especially a char array) may not be accessed through most other pointer types. Based on that, the compiler can/must make the following assumptions:

  1. If the type of the actual object itself is not known, a char* and T* could always point to the same object (alias each other) -> symmetric relationship.
  2. If T1 and T2 are not "related" and not char, then T1* and T2* may never point to the same object -> symmetric relationship.
  3. A char* may point to a char OR a T object
  4. A T* may NOT point to a char object -> asymmetric relationship
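The assumptions above can be sketched with two small functions (both names are hypothetical). In the first, the parameters are pointers to unrelated non-character types, so a compiler that relies on the aliasing rules may assume the store through f cannot change *i. In the second, the store goes through a character pointer, so the compiler has to allow for it modifying *i.

```c
/* int and float are unrelated, non-character types: under the aliasing
   rules the compiler may assume f does not point at *i and is free to
   return the constant 1 without re-reading *i. */
int store_then_read(int *i, float *f)
{
    *i = 1;
    *f = 2.0f;
    return *i;
}

/* A character pointer may point into any object, including *i, so the
   compiler must re-read *i after the byte store. */
int store_then_read_bytes(int *i, unsigned char *bytes)
{
    *i = 1;
    bytes[0] = 2;
    return *i;
}
```

Calling store_then_read_bytes(&x, (unsigned char *)&x) is well defined and returns the modified value of x; passing the same address to both parameters of store_then_read (via a cast) would break the rules the compiler is allowed to rely on.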

I believe the main rationale behind the asymmetric rules about accessing objects through pointers is that a char array might not satisfy the alignment requirements of e.g. an int.

So, even without compiler optimizations based on the strict aliasing rule, e.g. writing an int to the location of a 4-byte char array at addresses 0x1, 0x2, 0x3, 0x4 will - in the best case - result in poor performance and - in the worst case - access a different memory location, because the CPU instructions might ignore the lowest two address bits when writing a 4-byte value (so here this might result in a write to 0x0, 0x1, 0x2 and 0x3).
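As a rough sketch of the alignment point (all three function names are hypothetical): the first helper shows the address a 4-byte access would hit if the hardware ignored the lowest two address bits, and the other two show the usual way to store or load a 4-byte value at an arbitrary byte offset without ever forming a misaligned uint32_t pointer.

```c
#include <stdint.h>
#include <string.h>

/* Address actually used if the two lowest address bits are ignored,
   e.g. 0x1 becomes 0x0. */
uintptr_t ignore_low_two_bits(uintptr_t addr)
{
    return addr & ~(uintptr_t)3;
}

/* Store a 32-bit value at any byte offset by copying its bytes;
   no alignment requirement is placed on dst. */
void store_u32(unsigned char *dst, uint32_t value)
{
    memcpy(dst, &value, sizeof value);
}

/* Load a 32-bit value from any byte offset the same way. */
uint32_t load_u32(const unsigned char *src)
{
    uint32_t value;
    memcpy(&value, src, sizeof value);
    return value;
}
```

Compilers commonly turn such fixed-size memcpy calls into a single load or store on targets that tolerate unaligned access, so this form usually costs nothing where the direct cast would have worked anyway.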

Please also be aware that the meaning of "related" differs from language to language (between C and C++), but that is not relevant for your question.

Score: 4

if we have two pointers, one of type char * and another of type struct something * pointing to the same location, how is it possible that the first aliases the second but the second doesn't alias the first?

Pointers don't alias each other; that's sloppy use of language. Aliasing is when an lvalue is used to access an object of a different type. (Dereferencing a pointer gives an lvalue.)

In your example, what's important is the type of the object being aliased. For a concrete example let's say that the object is a double. Accessing the double by dereferencing a char * pointing at the double is fine because the strict aliasing rule permits this. However, accessing a double by dereferencing a struct something * is not permitted (unless, arguably, the struct starts with double!).
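A small sketch of the permitted case, assuming 8-bit bytes (the function name double_to_hex is hypothetical): the double is only ever read through an unsigned char lvalue, which the rule allows, and its object representation is written out as hex digits in memory order.

```c
#include <stddef.h>

/* Writes the object representation of *d as lowercase hex into out,
   which must have room for 2 * sizeof(double) + 1 characters.
   Assumes 8-bit bytes; the byte order is whatever the platform uses. */
void double_to_hex(const double *d, char *out)
{
    static const char digits[] = "0123456789abcdef";
    const unsigned char *bytes = (const unsigned char *)d;

    for (size_t i = 0; i < sizeof *d; ++i) {
        out[2 * i]     = digits[bytes[i] >> 4];
        out[2 * i + 1] = digits[bytes[i] & 0x0F];
    }
    out[2 * sizeof *d] = '\0';
}
```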

If the compiler is looking at a function which takes char * and struct something *, and it does not have available the information about the object being pointed to (this is actually unlikely as aliasing passes are done at a whole-program optimization stage), then it would have to allow for the possibility that the object might actually be a struct something, so no optimization could be done inside this function.

Score: 0

Many aspects of the C++ Standard are derived from the C Standard, which needs to be understood in the historical context when it was written. If the C Standard were being written to describe a new language which included type-based aliasing, rather than describing an existing language which was designed around the idea that accesses to lvalues were accesses to bit patterns stored in memory, there would be no reason to give any kind of privileged status to the type used for storing characters in a string. Having explicit operations to treat regions of storage as bit patterns would allow optimizations to be simultaneously more effective and safer. Had the C Standard been written in such fashion, the C++ Standard presumably would have been likewise.

As it is, however, the Standard was written to describe a language in which a very common idiom was to copy the values of objects by copying all of the bytes thereof, and the authors of the Standard wanted to allow such constructs to be usable within portable programs.

Further, the authors of the Standard intended that implementations process many non-portable constructs "in a documented manner characteristic of the environment" in cases where doing so would be useful, but waived jurisdiction over when that should happen, since compiler writers were expected to understand their customers' and prospective customers' needs far better than the Committee ever could.

Suppose that in one compilation unit, one has the function:

void copy_thing(char *dest, char *src, int size)
{
  while(size--)
    *(char volatile *)(dest++) = *(char volatile*)(src++);
}

and in another compilation unit:

float f1,f2;
float test(void)
{
  f1 = 1.0f;
  f2 = 2.0f;
  copy_thing((char*)&f2, (char*)&f1, sizeof f1);
  return f2;
}

I think there would have been a consensus among Committee members that no quality implementation should treat the fact that copy_thing never writes to an object of type float as an invitation to assume that the return value will always be 2.0f. There are many things about the above code that should prevent or discourage an implementation from consolidating the read of f2 with the preceding write, with or without a special rule regarding character types, but different implementations would have different reasons for their forbearance.
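As a sketch, the two fragments can be placed in a single file to check the required result. The calling function is renamed test_in_one_file here (a hypothetical name, to keep it distinct from the original test), and the size argument gets an explicit cast to int; otherwise the code is the same. Because the bytes of f1 are copied over f2 through character lvalues, the value read back from f2 must be 1.0f, not 2.0f.

```c
/* Byte-wise copy through character lvalues, as in the answer. */
void copy_thing(char *dest, char *src, int size)
{
    while (size--)
        *(char volatile *)(dest++) = *(char volatile *)(src++);
}

float f1, f2;

/* Same logic as test() in the answer, placed in the same file. */
float test_in_one_file(void)
{
    f1 = 1.0f;
    f2 = 2.0f;
    /* Overwrites every byte of f2 with the corresponding byte of f1. */
    copy_thing((char *)&f2, (char *)&f1, (int)sizeof f1);
    return f2;
}
```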

It would be difficult to describe a set of rules which would require that all implementations process the above code correctly without blocking some existing or plausible implementations from implementing what would otherwise be useful optimizations. An implementation that treated all inter-module calls as opaque would handle such code correctly even if it was oblivious to the fact that a cast from T1 to T2 is a sign that an access to a T2 may affect a T1, or the fact that a volatile access might affect other objects in ways a compiler shouldn't expect to understand. An implementation that performed cross-module in-lining and was oblivious to the implications of typecasts or volatile would process such code correctly if it refrained from making any aliasing assumptions about accesses via character pointers.

The Committee wanted to recognize something in the above construct that compilers would be required to recognize as implying that f2 might be modified, since the alternative would be to view such a construct as Undefined Behavior despite the fact that it should be usable within portable programs. Their choice of the access being made via a character pointer as the aspect that forced the issue was never intended to imply that compilers should be oblivious to everything else, even though unfortunately some compiler writers interpret the Standard as an invitation to do just that.
