lint ImproperCTypes: refactor linting architecture (part 2) #146273

Open
niacdoial wants to merge 2 commits into rust-lang:main from niacdoial:improperctypes-refactor2

Conversation

@niacdoial
Contributor

@niacdoial niacdoial commented Sep 6, 2025

This is the second PR in an effort to split #134697 (refactor plus overhaul of the ImproperCTypes family of lints) into individually-mergeable parts.

Contains the changes of the first PR, and splits the core type checking function into several bits, each focused on a specific aspect of FFI-safety.
Some logic which was outside of said core function was also moved into the new functions.

Superset of: #146271

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Sep 6, 2025
Contributor

@tgross35 tgross35 left a comment

I think a lot of this makes sense, but aren't there behavior changes here? It looks like tuples and arrays may be treated slightly differently.

Which is probably fine, that would ideally just be split from the refactoring and come with test updates.

Comment on lines +552 to +553
state: VisitorState,
outer_ty: Option<Ty<'tcx>>,
Contributor

It looks like outer_ty is only used for checking where tuples are used, and taking the entire Ty seems a bit heavyweight for that. Could VisitorState instead get a new flag / flags giving the needed context?

Contributor Author

in later commits, it is used to detect, er...

  • direct-use of a ty as an argument/return type/static (here)
  • being a pointee (and through which kind of indirection) (here)
  • if a ty::Slice is indeed a slice or if it is actually a !Sized array (here)

That seems like too much state to add into VisitorState. Unless we could forward the discriminant of outer_ty.kind() and use this? We would also need the mutability when said kind is Ref/RawPtr, though.

Also IIRC "an entire Ty" is only as big as a usize, so... actually I have no idea if that's the point you were making when saying it's "heavyweight".

Contributor

@tgross35 tgross35 Sep 7, 2025

Not heavyweight as in actual size, I meant the amount of information. Since you're already looking at the outer type once in the loop, it would be cleaner to extract the relevant info into some boolean flags at that time rather than passing the full type and needing to re-analyze it when you recurse. I think the relevant bits for the linked changes could easily enough be flags, e.g. the existing FN_RETURN / IN_MUT_REF, IN_ARRAY, IS_UNSIZED_POINTEE.

The other benefit is not dealing with Ty * Ty possibilities, which means fewer unexpected combinations to possibly bug! on.

Maybe just use a flag for this commit? And if needed later, it can be turned into a context struct.

@niacdoial niacdoial force-pushed the improperctypes-refactor2 branch from 1dbb0e2 to f54061c Compare September 6, 2025 20:31
@niacdoial
Contributor Author

> I think a lot of this makes sense, but aren't there behavior changes here? It looks like tuples and arrays may be treated slightly differently.
>
> Which is probably fine, that would ideally just be split from the refactoring and come with test updates.

I moved the actual change in behaviour into a later commit.
The rest of the changes here are just an exercise in moving more of the type-checking logic into the visit_* methods.

@niacdoial niacdoial force-pushed the improperctypes-refactor2 branch from f54061c to 66037fd Compare September 6, 2025 21:03
@niacdoial niacdoial force-pushed the improperctypes-refactor2 branch from 66037fd to 2781ebd Compare September 6, 2025 22:23
Comment on lines +392 to +406
fn visit_scalar(&self, ty: Ty<'tcx>) -> FfiResult<'tcx> {
    // FIXME(ctypes): for now, this is very incomplete, and seems to assume a x86_64 target
    match ty.kind() {
        // note: before rust 1.77, 128-bit ints were not FFI-safe on x86_64
        ty::Int(..) | ty::Uint(..) | ty::Float(..) => FfiResult::FfiSafe,
        ty::Bool => FfiResult::FfiSafe,

        ty::Char => FfiResult::FfiUnsafe {
            ty,
            reason: fluent::lint_improper_ctypes_char_reason,
            help: Some(fluent::lint_improper_ctypes_char_help),
        },
        _ => bug!("visit_scalar is to be called with scalar (char, int, float, bool) types"),
    }
}
Contributor

Would you mind linking to the changes this helps facilitate?

Thinking about this a bit more; bool is always FFI-safe and char is always FFI unsafe, so there probably isn't any reason not to move them out of this function into the big match.

(sorry for a bit of back and forth here)
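
For readers following along, here is a minimal sketch of what that suggestion could look like, assuming a toy type representation. `ToyKind`, `ToyFfiResult` and `classify` are hypothetical names for illustration, not rustc's `Ty`, `FfiResult` or any function in the PR, and the reason string is only a placeholder.

```rust
/// Toy stand-in for the kinds of types the lint looks at (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq)]
enum ToyKind {
    Int,
    Uint,
    Float,
    Bool,
    Char,
    Other,
}

/// Toy stand-in for the lint's verdict (hypothetical).
#[derive(Debug, PartialEq)]
enum ToyFfiResult {
    Safe,
    Unsafe { reason: &'static str },
    NotHandledHere,
}

/// One big match: `bool` and `char` are decided inline because their
/// verdict never depends on the target, so they need no helper function.
fn classify(kind: ToyKind) -> ToyFfiResult {
    match kind {
        ToyKind::Bool => ToyFfiResult::Safe,
        ToyKind::Char => ToyFfiResult::Unsafe {
            // Placeholder text, not the real diagnostic message.
            reason: "char has no direct C equivalent",
        },
        // Numeric scalars could still be routed to a dedicated helper later.
        ToyKind::Int | ToyKind::Uint | ToyKind::Float => ToyFfiResult::Safe,
        ToyKind::Other => ToyFfiResult::NotHandledHere,
    }
}
```

Calling `classify(ToyKind::Char)` yields the `Unsafe` variant, while `classify(ToyKind::Bool)` yields `Safe`, with no separate scalar helper involved.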

Contributor Author

There are two things that visit_scalar would facilitate, but neither is implemented just yet.

  • look at the safety of these things outside of x86_64
  • lint on possibly-broken value assumptions (if you have a reference or a pattern type as the argument of an extern "C" fn, for instance). The new visit_pattern logic would then submit the base type to visit_scalar.

Should I put this as a comment in the function?

that being said, yes I can move bool and char out of there.

Contributor

> look at the safety of these things outside of x86_64

Which platforms is this a problem for? As far as I know, there shouldn't be any more known incompatibilities for scalars - and if there are, we should probably just fix them.

Contributor Author

I... huh.
I kinda assumed there would be some and put off looking into them.

I expected there would be similar situations to u128-on-x86_64 where a type is defined in software (CPU registers can't hold it) and part of the standard but handled differently by different compiler/linker/stdlib stacks.

Contributor

I’m not aware of any remaining problems with the stable types, so I don’t think there is anything to account for here. I.e., visit_scalar can probably remain inlined. Technically there are ABI mismatches for f16 and f128, but we’re working to fix those before stabilization so won’t be linting on them.

> where a type is defined in software (CPU registers can't hold it) and part of the standard but handled differently by different compiler/linker/stdlib stacks.

Fwiw whether or not a type fits in registers doesn’t really determine the compatibility level; the ABI provides a spec for what to do (sometimes types that could fit in registers aren’t even passed that way). The problem with x86 i128 is that there the ABI specified what to do and LLVM wasn’t following it. (Not intending to self-promote, but my post on the subject is worth a read if you haven’t seen it https://blog.rust-lang.org/2024/03/30/i128-layout-update/).

There are platforms that don’t specify __int128 in the ABI (like x86-32). We also won’t be linting on that because GCC/Clang don’t let you use __int128 here, so there isn’t anything to be compatible with (bit more in the last paragraph of the PR description at #137306)
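
As a small illustration of the point about 128-bit integers, the sketch below declares an `extern "C"` function that takes an `i128` by value and reports the layout the current toolchain uses for the type. The names `low_half` and `i128_layout` are made up for this example. It deliberately does not assert a specific alignment, since that depends on the target and compiler version, and older toolchains may still emit an `improper_ctypes_definitions` warning for this signature.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical example function: takes a 128-bit integer by value across
/// the C ABI and returns its low 64 bits. On targets whose C ABI defines
/// `__int128`, the matching C prototype would take an `__int128`.
pub extern "C" fn low_half(x: i128) -> u64 {
    // `as` truncates to the low 64 bits.
    x as u64
}

/// Returns `(size, alignment)` of `i128` as seen by this compiler and target.
pub fn i128_layout() -> (usize, usize) {
    (size_of::<i128>(), align_of::<i128>())
}
```

Printing `i128_layout()` on the targets you care about is a quick way to compare against what the C compiler reports for `sizeof(__int128)` and `_Alignof(__int128)`, where that type exists.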

@tgross35
Contributor

tgross35 commented Sep 7, 2025

> I moved the actual change in behaviour in a later commit.
> The rest of the changes here are just an exercise in moving more of the type-checking logic into the visit_* methods.

"split type visiting into subfunctions" still has some changes right? Array went from just checking the type to checking whether or not it is in a function. Which is probably a reasonable change to make, it should just be its own thing (and come with a test update).

(Possible I'm missing something here)

@niacdoial
Contributor Author

> Array went from just checking the type to checking whether or not it is in a function.

That's a bit of logic that was moved from check_type to visit_type

Comment on lines +691 to +697
match outer_ty.map(|ty| ty.kind()) {
    // C functions can return void
    None | Some(ty::FnPtr(..)) => state.is_in_function_return(),
    // most of those are not even reachable,
    // but let's not worry about checking that here
    _ => false,
}
Contributor

Is is_in_function_return ever set when the outer type isn't FnPtr? Seems like it should be sufficient to check only that without the match.

Contributor Author

that's the subtlety here:
if you check, say extern "C" fn create_something() -> &():

  • it will start by looking at the &() type, for which outer_ty is None or Some(ty::FnPtr) (approximately) and state.is_in_function_return() is true.
  • Then, if it looks at the pointee, (), state.is_in_function_return() remains true, but outer_ty becomes Some(&()) (approximately).
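
The two steps above can be sketched on a toy model. `ToyTy`, `visit` and `check_return` are hypothetical and far simpler than rustc's real types. The sketch only shows why a "we are in a function return" flag alone cannot tell a direct `()` return apart from a `()` reached through a reference, and it leaves out the `FnPtr` case.

```rust
/// Toy stand-in for rustc's `Ty` (hypothetical, for illustration only).
#[derive(Debug, Clone, PartialEq)]
enum ToyTy {
    Unit,
    Int,
    Ref(Box<ToyTy>),
}

/// `()` counts as a C `void` return only when nothing encloses it.
fn unit_is_void_return(outer: Option<&ToyTy>, in_fn_return: bool) -> bool {
    match outer {
        None => in_fn_return,
        Some(_) => false,
    }
}

/// Walks `ty`, pushing one verdict per `()` encountered.
/// `in_fn_return` is passed down unchanged; `outer` changes at every level.
fn visit(ty: &ToyTy, outer: Option<&ToyTy>, in_fn_return: bool, verdicts: &mut Vec<bool>) {
    match ty {
        ToyTy::Unit => verdicts.push(unit_is_void_return(outer, in_fn_return)),
        ToyTy::Int => {}
        // The pointee sees the reference as its outer type.
        ToyTy::Ref(inner) => visit(inner, Some(ty), in_fn_return, verdicts),
    }
}

/// Checks a type used directly as a function's return type.
fn check_return(ty: &ToyTy) -> Vec<bool> {
    let mut verdicts = Vec::new();
    visit(ty, None, true, &mut verdicts);
    verdicts
}
```

With this model, `check_return(&ToyTy::Unit)` accepts the unit type, while `check_return(&ToyTy::Ref(Box::new(ToyTy::Unit)))` rejects it even though `in_fn_return` is `true` in both walks.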

Contributor

Hm, is there any reason to track whether or not we are in a function return at anything other than the first level? FN_RETURN could be renamed to DIRECT_FN_RETURN; save it at the top of visit_ty and .remove it from state (so it is unset if we recurse back to visit_ty via indirection), then use that saved value for the tuple check here.

Relates to #146273 (comment).

Contributor Author

> Hm, is there any reason to track whether or not we are in a function return at anything other than the first level?

Just checked this. In a later commit in this chain, it will be used when dealing with uninhabited types.

Contributor Author

(that being said, I'm working on something along the lines of that comment, and I'm realising that what I called "VisitorState" should have been called something else, so that the new thing could get this name.)

Contributor

It seems like there are effectively some flags that only apply once (i.e. die when you call visit_ty on a nested type) and some flags that need to persist. I don't think it would be unreasonable to have both DIRECT_FN_RETURN that gets reset, and WITHIN_FN_RETURN that sticks around for all calls if you need both.

A clear separation may be useful if there are more:

impl VisitorState {
    const NO_RECURSE_FLAGS: Self = Self::DIRECT_FN_RETURN | Self::SOME_OTHER_FLAG;

    /// Reset flags that shouldn't be persisted through recursion, returning them.
    fn no_recurse_flags(&mut self) -> Self {
        let ret = self.intersection(Self::NO_RECURSE_FLAGS);
        self.remove(Self::NO_RECURSE_FLAGS);
        ret
    }
}

fn visit_ty(mut state, ...) {
    let no_recurse_flags = state.no_recurse_flags();

    // use no_recurse_flags.contains(DIRECT_FN_RETURN) for the tuple check
}
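
A self-contained version of the sketch above, for anyone who wants to try the idea outside the compiler. It uses a hand-rolled `u8` wrapper instead of a `bitflags`-style type so that it has no dependencies. The flag values, the name `take_no_recurse_flags` and the depth-based `visit_ty` are assumptions made for this example, not code from the PR.

```rust
/// Hand-rolled flag set standing in for the PR's `VisitorState`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct VisitorState(u8);

impl VisitorState {
    /// Hypothetical flag: the type is itself the return type.
    const DIRECT_FN_RETURN: Self = Self(0b01);
    /// Hypothetical flag: the type is somewhere inside a return type.
    const WITHIN_FN_RETURN: Self = Self(0b10);
    /// Flags that must not survive a recursive call.
    const NO_RECURSE_FLAGS: Self = Self::DIRECT_FN_RETURN;

    const fn union(self, other: Self) -> Self {
        Self(self.0 | other.0)
    }

    fn contains(self, other: Self) -> bool {
        (self.0 & other.0) == other.0
    }

    /// Clears the one-shot flags from `self` and returns the ones that were set.
    fn take_no_recurse_flags(&mut self) -> Self {
        let taken = Self(self.0 & Self::NO_RECURSE_FLAGS.0);
        self.0 &= !Self::NO_RECURSE_FLAGS.0;
        taken
    }
}

/// Simulates visiting a type nested `depth` levels deep, recording
/// `(is_direct_return, is_within_return)` at each level.
fn visit_ty(mut state: VisitorState, depth: u32, seen: &mut Vec<(bool, bool)>) {
    let once = state.take_no_recurse_flags();
    seen.push((
        once.contains(VisitorState::DIRECT_FN_RETURN),
        state.contains(VisitorState::WITHIN_FN_RETURN),
    ));
    if depth > 0 {
        // `state` no longer carries DIRECT_FN_RETURN at this point.
        visit_ty(state, depth - 1, seen);
    }
}
```

Starting from `DIRECT_FN_RETURN | WITHIN_FN_RETURN` and recursing, only the outermost level observes the direct flag, while the persistent flag is visible at every level.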

@niacdoial niacdoial force-pushed the improperctypes-refactor2 branch 2 times, most recently from efe195a to 69b0807 Compare September 11, 2025 21:56
@niacdoial niacdoial force-pushed the improperctypes-refactor2 branch from 69b0807 to 64106e6 Compare September 11, 2025 22:10
@tgross35 tgross35 self-assigned this Sep 12, 2025
@tgross35
Contributor

Btw if these are ready for a more final review, feel free to un-draft them (just gets them actually into my queue)

@niacdoial niacdoial force-pushed the improperctypes-refactor2 branch from 64106e6 to 359cb79 Compare September 19, 2025 21:15
@niacdoial
Contributor Author

just double-checked:
I'm pretty sure I covered all the things you made reviews on
(the one change in this force-push is renaming IndirectionType->IndirectionKind)

@tgross35
Contributor

@niacdoial what exactly are these waiting on? I assume they're close to ready based on your above comment, but they are still marked as drafts.

@niacdoial niacdoial marked this pull request as ready for review September 22, 2025 18:38
@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Sep 22, 2025
@niacdoial
Contributor Author

niacdoial commented Sep 22, 2025

ah, I knew I was missing something (talking about the PR still being a draft)

Contributor

@tgross35 tgross35 left a comment

Sorry for the delay here, I've been behind for a little while. Should be catching up now, though.

Comment on lines +306 to +307
/// To annotate pointees (through Ref,RawPtr,Box).
const IN_PTR = 0b000001;
Contributor

What does IN_ represent here? Same for IN_ADT

Contributor Author

it's building on the mental model of "entering" and "exiting" types as they are visited: the flag IN_ADT says that the type currently being checked at this depth is a field of the ADT that's "one layer outside".
same with IN_PTR, saying that we are dealing with the pointee of what's "one layer outside".
...though now that you point it out, the flags' names make them sound like they would also be applied recursively when checking "further in", which is not the case.

const NO_OUTER_TY = 0b100000;
/// For NO_OUTER_TY cases, show that we are being directly used by a FnPtr specifically
/// FIXME(ctypes): this is only used for "bad behaviour" reproduced for compatibility's sake
const NOOUT_FNPTR = 0b1000000;
Contributor

Keeping the NO_ prefix

        const NO_OUT_FNPTR = 0b1000000;

Comment on lines +317 to +319
/// To show that there is no outer type, the current type is directly used by a `static`
/// variable or a function/FnPtr
const NO_OUTER_TY = 0b100000;
Contributor

Shouldn't this be the all zeros case, i.e. the default?

@rustbot rustbot removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Oct 21, 2025
@tgross35
Contributor

I'm really sorry, I haven't had the time to get back to all my reviews and should have rerolled this long ago. In the future, please feel free to ping, message, or reroll if it goes this long.

r? compiler

@rustbot rustbot assigned petrochenkov and unassigned tgross35 Jan 23, 2026
@niacdoial
Contributor Author

niacdoial commented Jan 23, 2026

sure!
I think I'll need to refamiliarise myself with the feedback so far, triple-check everything's dealt with.
Is there anything else you want me to do, or should I just wait for re-review for now? Unless you want me to rebase things before re-reviewing?

edit from the next day: ah wait I didn't process what you actually said. I'll rebase for the new reviewer. In any case, thanks for the help until this point!

I guess the main reason why I didn't ping earlier was that I wanted to do something first, even though it's not immediately relevant (it's not for these "part2" and "part3" PRs): getting a coverage report to make sure as much of this lint as possible is covered. (I managed to make rustc spit out some .profraw files only this week, and so far no luck getting the coverage of rustc_lint specifically.)

@niacdoial niacdoial force-pushed the improperctypes-refactor2 branch from ee32bf7 to bce41ce Compare January 24, 2026 16:44
Another internal change that shouldn't impact rustc users.
The goal is to break apart the gigantic visit_type function into more
manageable and easily-editable bits that focus on specific parts of FFI safety.
@niacdoial niacdoial force-pushed the improperctypes-refactor2 branch from bce41ce to eb49ba8 Compare March 20, 2026 07:18
@rustbot rustbot added the T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap) label Mar 20, 2026
@rustbot
Collaborator

rustbot commented Mar 20, 2026

This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed.

Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers.

Another user-transparent change, unifying outer-type information and the
existing VisitorState flags.
@niacdoial niacdoial force-pushed the improperctypes-refactor2 branch from eb49ba8 to c52aa53 Compare March 20, 2026 21:11
@niacdoial
Contributor Author

whoops, a .profraw file got committed and now I have a meaningless "T-bootstrap" label on the PR
also, sorry I didn't realise the process was still officially waiting on me, let me fix this
@rustbot review

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Mar 21, 2026
@petrochenkov petrochenkov removed the T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap) label Mar 25, 2026
@petrochenkov
Contributor

I don't personally care about this part of the compiler, but I can gradually review and merge this piece-by-piece if you submit the changes in smaller single-purpose portions, each accompanied by its motivation.
(Or I can reassign to someone else.)

@petrochenkov petrochenkov added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 25, 2026
@niacdoial
Contributor Author

I guess I can do that?
The commits can probably be merged independently, and the first of them can maybe be split into two.
not much more than that. This PR is already a "part 2/3 of the first half" of the original thing.
Then again, if you think it's better to reroll, you can do it. I'm not running out of time here, hehe. Otherwise, thank you for accepting to take a look at this!

the motivation for this entire "first half" (three PRs) is to make the lint's system a lot easier to understand and tweak, so that the behaviour of the lint can be entirely changed later (second half)

so.. should I start working on splitting this PR further?

@petrochenkov
Contributor

@rustbot reroll (I'm on vacation next week)
If this is still not reviewed in a couple of weeks, feel free to start splitting and assigning to me.

@rustbot
Collaborator

rustbot commented Mar 26, 2026

Error: Parsing assign command in comment failed: ...' reroll' | error: expected end of command at >| ' (I'm on v'...

Please file an issue on GitHub at triagebot if there's a problem with this bot, or reach out on #triagebot on Zulip.

@petrochenkov
Contributor

@rustbot reroll

@rustbot rustbot assigned JohnTitor and unassigned petrochenkov Mar 26, 2026
@niacdoial
Contributor Author

...oh right.
@rustbot review

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Mar 28, 2026
Labels

S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.


6 participants