Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.
Bug 352071 - [checker] Checker for invalid and discouraged casts
Summary: [checker] Checker for invalid and discouraged casts
Status: NEW
Alias: None
Product: CDT
Classification: Tools
Component: cdt-codan (show other bugs)
Version: 8.0   Edit
Hardware: PC Windows 7
: P3 normal (vote)
Target Milestone: ---   Edit
Assignee: CDT Codan Inbox CLA
QA Contact: Elena Laskavaia CLA
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 249566
  Show dependency tree
 
Reported: 2011-07-14 08:02 EDT by Tomasz Wesolowski CLA
Modified: 2019-03-23 15:11 EDT (History)
3 users (show)

See Also:


Attachments
[w.i.p.] checker for C (18.90 KB, patch)
2011-07-18 13:25 EDT, Tomasz Wesolowski CLA
no flags Details | Diff
[w.i.p.] checker + tests (36.47 KB, patch)
2011-08-04 17:38 EDT, Tomasz Wesolowski CLA
no flags Details | Diff
[w.i.p.] checker for c/c++, tests (46.22 KB, patch)
2011-08-11 17:08 EDT, Tomasz Wesolowski CLA
no flags Details | Diff
[review] checker for c/c++, tests (80.09 KB, patch)
2011-08-18 19:42 EDT, Tomasz Wesolowski CLA
no flags Details | Diff
[review] checker for c/c++, tests (65.82 KB, patch)
2011-08-18 19:46 EDT, Tomasz Wesolowski CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Tomasz Wesolowski CLA 2011-07-14 08:02:18 EDT
Invalid casts:
- Implicit cast when explicit cast is needed

Discouraged casts:
- C-style casts in C++
- Implicit narrowing numeric casts (explicit cast encouraged for clarity)
Comment 1 Tomasz Wesolowski CLA 2011-07-18 13:25:52 EDT
Created attachment 199851 [details]
[w.i.p.] checker for C

Work in progress

Working so far:
- looks for implicit casts in: initalizers, assignments, return statements
- checks according to rules of C99
- reports general problems 

TBD (+- in order):
- test cases for C, C++ (lots of those)
- handle C++
- rewrite reported problems to be more detailed

Is it OK for me to create a new problem sub-category for different types of discouraged casts? I think the user may want to separately set settings for warnings on singed/unsigned implicit cast, integral/float cast, int/enum cast etc ... There's going to be a few of those and it seems to make sense to group them in a new sub-group under Programming Style.
Comment 2 Tomasz Wesolowski CLA 2011-08-04 17:38:52 EDT
Created attachment 200947 [details]
[w.i.p.] checker + tests

I've managed to cover most of the issues in tests (turned out to be quite big), the remaining thing is covering user conversions (i.e. overloaded operators, copy constructors, etc). Needed quite a bit of reading and experimenting. The checker needs more work, but it's going to be downhill from now on I think.

I'm not really sure how strict should the checker be in terms of suggesting explicit casts where an implicit cast is possible. I prefer an explicit cast wherever something non-trivial or non-typesafe happens (be it C or C++), but there are C coders who cast implicitly a lot. The most common case is void* to anytype*.

The test case is commented richly, so please have a look. Feedback very much appreciated.
Comment 3 Andrew Gvozdev CLA 2011-08-05 11:56:00 EDT
Your checker deals with implicit casts as I see from the test cases. In practice, I wouldn't worry much about those (although, admittedly, if you really unlucky to hit it it could be really tricky to troubleshoot).

What I am facing in practice is a mass tendency especially from unexperienced folks to "fix" type complains from compiler just shoving explicit cast into the code making the compiler to shut up, especially when asked to clean up the code. What I was hoping to get from this bug is Codan warnings about *explicit* casts, in particular in C code. Normally explicit casts should be really rare except casting malloc() and few other exceptions but it is difficult to spot those violations of policy either visually or with a tool. Codan would be really helpful here. 

I see fairly often in the code casts like that:

char* fun1();
int rc = (int)fun1();

int fun2(struct X*);
struct X** ss;
fun2((struct X*)ss);

int fun3(char* str);
const char str[];
fun3((char*)str);

Those are far less obvious in real code. Would it be possible to cover things like that?
Comment 4 Tomasz Wesolowski CLA 2011-08-09 18:13:31 EDT
Thanks for the input, Andrew! That's a very reasonable point of view.

It's true that so far I've been trying to categorise all casts (be it implicit or explicit) into categories such as:

a) Safe and common (float->double, T*->void*, ...)
b) Potentially dangerous (void*->T*, unsigned->signed, ...)
c) Nonsense or incorrect (T*->int, float->enum, ...)

Then we could invent subcategories basing on what compiles (0), what compiles but needs an explicit cast (1) and what doesn't compile (2) - thus ending up with a, b0, b1, c0, c1 and c2. Especially in C, there's a lot in c0 and c1 because things which cause an UB still get compiled.

Let's have a look at your code snips:

> char* fun1();
> int rc = (int)fun1();

ptr->int, C0 in C and C++ (compiles even with implicit cast)

> int fun2(struct X*);
> struct X** ss;
> fun2((struct X*)ss);

ptr->incompatible ptr, C0 in C, C1 in C++ (needs reinterpret_cast to compile and still is wrong)

> int fun3(char* str);
> const char str[];
> fun3((char*)str);

CV-unsafe pointer conversion (cv-unsafe), B0 in C, B1 in C++ (needs const_cast and in general case is likely to be wrong). I'll get back to this specific issue later.

-----

I think that it would be useful to just make the checker additionally visit all the explicit casts and report problems on all C-rated casts found this way. This would resolve the initial two issues mentioned above.

About B-rated casts, especially B0 (those which compile implicitly but may be dangerous) - I believe it's good style to make a redundant explicit cast to *indicate that the cast is intentional*. So, on an example (C):

> void* ptr = something();
> int* iptr1 = ptr; // warning here - non-typesafe cast
> int* iptr2 = (int*) iptr; // no warning here - "i cast explicitly because I want to emphasize that I know what I'm doing"

> float f = 123.456f;
> int i1 = f; // warning here - potential precision loss
> int i2 = (int) f; // no warning here

Do you find such an approach as reasonable as I do?

-----
(warning: big rant about const casts follows)

I promised to talk more about this situation:

> int fun3(char* str);
> const char str[];
> fun3((char*)str);

There are cases when dropping a `const` qualifier is justified... Not this one of course :). Usually "justified" means just "working around some bad design which we can't correct" - the simplest example being a library function taking char* when it needs a const char*. C++ is armed with const_cast for a reason.

The C++ standard enforces const safety in implicit casts, C doesn't. Furthermore, C++ requires us to explicitly write a const_cast, which makes the code self-documenting. (Unfortunately, a C-style cast in C++ can perform a "hidden" const_cast, but discouraging C-style casts in C++ is another story).

Assume the situation I've mentioned, firstly in C++:
> void poor_library_func(char* c);
> const char* myStr = "...";
> poor_library_func(myStr); // doesn't compile, checker shows an error (unsafe implicit cast)
> poor_library_func(const_cast<char*>(myStr) // compiles, is self-documenting, checker shows no warnings

Then in C, analogously:
> poor_library_func(myStr); // compiles, checker shows an error (unsafe implicit cast)
> poor_library_func((char*)myStr); // compiles, checker... ???

How should the checker behave here? A C-style cast doesn't make the code self-documenting and the programmer may overlook that a non-const-safe cast is being performed here, so - contrary to the C++ case - this yearns for a warning. However, then there must be a way for the programmer to indicate that a non-const-safe cast is intended. 
One option is a comment pattern (only for this special case and only for C):

> poor_library_func(/* const cast */ (char*)myStr); // compiles, checker shows no warnings

This would ensure that the code is as self-documenting as possible, but would require extra work for the programmer.

This is untrivial and needs a decision. Maybe a Codan problem preference? Something like:
> Const safety in C casts:
> - Always warn
> - Require a confirmation via comment pattern: "pattern" (default)
> - Always ignore
Comment 5 Tomasz Wesolowski CLA 2011-08-11 17:08:18 EDT
Created attachment 201354 [details]
[w.i.p.] checker for c/c++, tests

The most of remaining part is handling classes, I've spend quite some time on tests for that and got the rules mostly worked out so the rest should go quick. This was harder to think over than I thought, but I'm quite happy with where this checker is heading.

Roadmap after that:
- handle explicit casts
- split problems into more detailed
Comment 6 Tomasz Wesolowski CLA 2011-08-18 19:42:45 EDT
Created attachment 201757 [details]
[review] checker for c/c++, tests

This version handles both C and C++ semantics and also looks at explicit casts as Andrew Gvozdev has suggested. There's still some cleanup to do (strings etc), some possible extensions etc. However I'd very much use a review now.

From last time I tried to take into consideration custom conversion sequences from C++ (using converting constructors, converting operators), but this approach proved to be very complex and probably very hard to get right here, as it is connected to many semantics like overload resolution.

A more feasible approach imho is to only consider standard conversion sequences and take into consideration that 

The conversion warnings have been split into 18 distinct problems. Some of them can be surpressed by an explicit cast, some don't.

When deciding where to warn, I've attempted to be more strict than what the standard enforces (esp. in C), but also to keep it reasonable.

One test is failing ATM: I haven't yet implemented base class accessibility checking (the compiler catches that anyway, so that's not a priority at the moment).

Some open issues:

- Should the conversion void* to type* be allowed implicit in C? It needs a cast by language in C++, but C allows this cast implicitly and in fact people don't usually, for instance, cast the result of malloc().

- What about using C casts which break const/volatile safety? They are allowed now. See the rant 2 comments above.

Please review.
Comment 7 Tomasz Wesolowski CLA 2011-08-18 19:46:05 EDT
Created attachment 201758 [details]
[review] checker for c/c++, tests

some irrelevant changes creeped in, fixed patch