What Is the Meaning of Clang's -Wweak-Vtables

The meaning of _ and __ in clang variables and functions

_ and __ are just part of the identifier. They don't have any other special meaning.

However, identifiers that contain a double underscore or start with an underscore followed by an upper-case letter are reserved for the implementation (the compiler and the standard library). User code is not allowed to declare them or define them as macros.

The standard library uses only such reserved identifiers for internal use to make sure that it doesn't interfere with any user code that is supposed to be valid.

For example this is a valid program:

#define sz
#define SizeT
#include<string_view>
int main() {}

If the standard library implementation shown in the question was using just sz and SizeT instead of __sz and _SizeT, this would fail to compile.

However

#define __sz
#define _SizeT
#include<string_view>
int main() {}

is not a valid program and therefore it is ok if it fails to compile.

What is the meaning of the 'AnalyzeTemporaryDtors' option in clang-tidy?

AnalyzeTemporaryDtors is an artifact from an older clang-tidy version, notably 6 and below.

The corresponding option -analyze-temporary-dtors= was removed, but since a lot of people do -dump-config (which dumps every option), removing AnalyzeTemporaryDtors apparently broke a lot of projects with such generated .clang-tidy, so it was added back for compatibility purposes: https://reviews.llvm.org/rG6e76a1b1ff98b27b82689b6294cde1d355be088f.

Feel free to remove it from your .clang-tidy.

How does Clang's did you mean ...? variable name correction algorithm work?

You are unlikely to find it documented, but since Clang is open source you can turn to the source code to try to figure it out.

Clangd?

The particular diagnostic (from DiagnosticSemaKinds.td):

def err_undeclared_var_use_suggest : Error<
"use of undeclared identifier %0; did you mean %1?">;

is referred to from clang-tools-extra/clangd/IncludeFixer.cpp:

    // Try to fix unresolved name caused by missing declaration.
    // E.g.
    //   clang::SourceManager SM;
    //          ~~~~~~~~~~~~~
    //          UnresolvedName
    //   or
    //   namespace clang {  SourceManager SM; }
    //                      ~~~~~~~~~~~~~
    //                      UnresolvedName
    // We only attempt to recover a diagnostic if it has the same location as
    // the last seen unresolved name.
    if (DiagLevel >= DiagnosticsEngine::Error &&
        LastUnresolvedName->Loc == Info.getLocation())
      return fixUnresolvedName();

Now, clangd is a language server, and to be honest I don't know whether this is actually used by the Clang compiler frontend to yield certain diagnostics, but you're free to continue down the rabbit hole to tie these details together. The fixUnresolvedName above eventually performs a fuzzy search:

    if (llvm::Optional<const SymbolSlab *> Syms = fuzzyFindCached(Req))
      return fixesForSymbols(**Syms);

If you want to dig into the details, I would recommend starting with the fuzzyFindCached function:

llvm::Optional<const SymbolSlab *>
IncludeFixer::fuzzyFindCached(const FuzzyFindRequest &Req) const {
  auto ReqStr = llvm::formatv("{0}", toJSON(Req)).str();
  auto I = FuzzyFindCache.find(ReqStr);
  if (I != FuzzyFindCache.end())
    return &I->second;

  if (IndexRequestCount >= IndexRequestLimit)
    return llvm::None;
  IndexRequestCount++;

  SymbolSlab::Builder Matches;
  Index.fuzzyFind(Req, [&](const Symbol &Sym) {
    if (Sym.Name != Req.Query)
      return;
    if (!Sym.IncludeHeaders.empty())
      Matches.insert(Sym);
  });
  auto Syms = std::move(Matches).build();
  auto E = FuzzyFindCache.try_emplace(ReqStr, std::move(Syms));
  return &E.first->second;
}

along with the type of its single function parameter, FuzzyFindRequest in clang/index/Index.h:

struct FuzzyFindRequest {
  /// A query string for the fuzzy find. This is matched against symbols'
  /// un-qualified identifiers and should not contain qualifiers like "::".
  std::string Query;
  /// If this is non-empty, symbols must be in at least one of the scopes
  /// (e.g. namespaces) excluding nested scopes. For example, if a scope "xyz::"
  /// is provided, the matched symbols must be defined in namespace xyz but not
  /// namespace xyz::abc.
  ///
  /// The global scope is "", a top level scope is "foo::", etc.
  std::vector<std::string> Scopes;
  /// If set to true, allow symbols from any scope. Scopes explicitly listed
  /// above will be ranked higher.
  bool AnyScope = false;
  /// The number of top candidates to return. The index may choose to
  /// return more than this, e.g. if it doesn't know which candidates are best.
  llvm::Optional<uint32_t> Limit;
  /// If set to true, only symbols for completion support will be considered.
  bool RestrictForCodeCompletion = false;
  /// Contextually relevant files (e.g. the file we're code-completing in).
  /// Paths should be absolute.
  std::vector<std::string> ProximityPaths;
  /// Preferred types of symbols. These are raw representation of `OpaqueType`.
  std::vector<std::string> PreferredTypes;

  bool operator==(const FuzzyFindRequest &Req) const {
    return std::tie(Query, Scopes, Limit, RestrictForCodeCompletion,
                    ProximityPaths, PreferredTypes) ==
           std::tie(Req.Query, Req.Scopes, Req.Limit,
                    Req.RestrictForCodeCompletion, Req.ProximityPaths,
                    Req.PreferredTypes);
  }
  bool operator!=(const FuzzyFindRequest &Req) const { return !(*this == Req); }
};
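
To make the control flow of fuzzyFindCached easier to follow, here is a minimal standalone sketch of the same pattern: look in a cache first, refuse to query once a request budget is used up, and keep only results whose name equals the query exactly. The class name CachedExactLookup, the Backend callback and the use of plain strings in place of Symbol and SymbolSlab are all assumptions made for illustration; this is not clangd code.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the cache-then-query logic of fuzzyFindCached.
class CachedExactLookup {
public:
  // Stand-in for the symbol index: returns candidate names for a query.
  using Backend = std::function<std::vector<std::string>(const std::string &)>;

  CachedExactLookup(Backend backend, unsigned requestLimit)
      : backend_(std::move(backend)), requestLimit_(requestLimit) {}

  // Returns the cached matches for `query`, or nullptr when the query is not
  // cached and the request budget is already exhausted.
  const std::vector<std::string> *find(const std::string &query) {
    auto it = cache_.find(query);
    if (it != cache_.end())
      return &it->second; // cache hit: the backend is not consulted

    if (requestCount_ >= requestLimit_)
      return nullptr; // budget exhausted: give up instead of querying
    ++requestCount_;

    // Keep only candidates whose name matches the query exactly.
    std::vector<std::string> matches;
    for (const std::string &name : backend_(query))
      if (name == query)
        matches.push_back(name);

    // std::map never relocates its elements, so the pointer stays valid.
    auto inserted = cache_.try_emplace(query, std::move(matches));
    return &inserted.first->second;
  }

private:
  Backend backend_;
  unsigned requestLimit_;
  unsigned requestCount_ = 0;
  std::map<std::string, std::vector<std::string>> cache_;
};
```

Note that, despite the word "fuzzy" in its name, the function shown above discards every symbol whose name differs from Req.Query, so what it caches are exact-name matches.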

Other rabbit holes?

The following commit may be another place to start from:

[Frontend] Allow attaching an external sema source to compiler instance and extra diags to TypoCorrections

This can be used to append alternative typo corrections to an existing
diag. include-fixer can use it to suggest includes to be added.

Differential Revision: https://reviews.llvm.org/D26745

from which we may end up in clang/include/clang/Sema/TypoCorrection.h, which sounds more like a feature used by the compiler frontend itself than by the (clang extra tool) clangd. E.g.:

  /// Gets the "edit distance" of the typo correction from the typo.
  /// If Normalized is true, scale the distance down by the CharDistanceWeight
  /// to return the edit distance in terms of single-character edits.
  unsigned getEditDistance(bool Normalized = true) const {
    if (CharDistance > MaximumDistance || QualifierDistance > MaximumDistance ||
        CallbackDistance > MaximumDistance)
      return InvalidDistance;
    unsigned ED =
        CharDistance * CharDistanceWeight +
        QualifierDistance * QualifierDistanceWeight +
        CallbackDistance * CallbackDistanceWeight;
    if (ED > MaximumDistance)
      return InvalidDistance;
    // Half the CharDistanceWeight is added to ED to simulate rounding since
    // integer division truncates the value (i.e. round-to-nearest-int instead
    // of round-to-zero).
    return Normalized ? NormalizeEditDistance(ED) : ED;
  }

used in clang/lib/Sema/SemaDecl.cpp:

// Callback to only accept typo corrections that have a non-zero edit distance.
// Also only accept corrections that have the same parent decl.
class DifferentNameValidatorCCC final : public CorrectionCandidateCallback {
 public:
  DifferentNameValidatorCCC(ASTContext &Context, FunctionDecl *TypoFD,
                            CXXRecordDecl *Parent)
      : Context(Context), OriginalFD(TypoFD),
        ExpectedParent(Parent ? Parent->getCanonicalDecl() : nullptr) {}

  bool ValidateCandidate(const TypoCorrection &candidate) override {
    if (candidate.getEditDistance() == 0)
      return false;
    // ...
  }
  // ...
};
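
The general idea behind an edit-distance-based "did you mean" can be sketched in a few lines. The following is only an illustration, not Clang's implementation: it uses a plain Levenshtein distance, whereas getEditDistance above combines weighted character, qualifier and callback distances. The function names editDistance and suggest and the maxDistance cut-off are assumptions introduced for this sketch.

```cpp
#include <algorithm>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <vector>

// Levenshtein distance: the minimum number of single-character insertions,
// deletions and substitutions needed to turn `a` into `b`.
std::size_t editDistance(std::string_view a, std::string_view b) {
  // Only the previous row of the dynamic-programming table is kept.
  std::vector<std::size_t> row(b.size() + 1);
  for (std::size_t j = 0; j <= b.size(); ++j)
    row[j] = j;
  for (std::size_t i = 1; i <= a.size(); ++i) {
    std::size_t diagonal = row[0]; // table[i-1][j-1]
    row[0] = i;
    for (std::size_t j = 1; j <= b.size(); ++j) {
      std::size_t above = row[j]; // table[i-1][j]
      std::size_t substitution = diagonal + (a[i - 1] == b[j - 1] ? 0 : 1);
      row[j] = std::min({above + 1, row[j - 1] + 1, substitution});
      diagonal = above;
    }
  }
  return row[b.size()];
}

// Hypothetical helper: returns the closest visible name, or nothing if every
// candidate is identical to the typo or further away than `maxDistance`.
std::optional<std::string> suggest(std::string_view typo,
                                   const std::vector<std::string> &visibleNames,
                                   std::size_t maxDistance) {
  std::optional<std::string> best;
  std::size_t bestDistance = 0;
  for (const std::string &name : visibleNames) {
    std::size_t d = editDistance(typo, name);
    // Distance 0 means "same name", which is not a correction; this mirrors
    // the first check in DifferentNameValidatorCCC::ValidateCandidate.
    if (d == 0 || d > maxDistance)
      continue;
    if (!best || d < bestDistance) {
      best = name;
      bestDistance = d;
    }
  }
  return best;
}
```

With these definitions, suggest("pritnf", {"printf", "puts"}, 2) yields "printf": the swapped letters cost two substitutions, while "puts" is further away than the cut-off.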

Meaning of Xcode clang error not found in architecture i386

The class IASKSettingsReader/IASKSettingsStoreUserDefaults/... is not being linked. Check that its source file is included in your Xcode project and listed under Compile Sources in the Build Phases of your target; if it is not, add it.

How does clang check redefinitions?

TLDR; see Answers below.



Discussion

All of your questions relate to one term of the C standard, the identifier, defined in C99-6.2.1-p1:

An identifier can denote an object; a function; a tag or a member of a structure, union, or
enumeration; a typedef name; a label name; a macro name; or a macro parameter.

Each identifier has a scope, according to C99-6.2.1-p2:

For each different entity that an identifier designates, the identifier is visible (i.e., can be
used) only within a region of program text called its scope.

Since you are interested in variables declared inside a function (e.g., int x), the identifier in question has block scope.

Declarations of the same identifier can be tied together by a process called linkage, according to C99-6.2.2-p2:

An identifier declared in different scopes or in the same scope more than once can be
made to refer to the same object or function by a process called linkage.

Linkage is what makes one name refer to one object across declarations, and so it is where the "definition legality check" you describe happens for global variables. Therefore, compiling the following code

/* file_1.c */
int a = 123;

/* file_2.c */
int a = 456;

would cause a link error:

% clang file_*
...
ld: 1 duplicate symbol
clang: error: linker command failed with exit code 1

However, in your case the identifiers are inside the same function body, which looks more like the following:

/* file.c */
int main(){
int b;
int b=1;
}

Here identifier b has a block scope, which shall have no linkage, according to C99-6.2.2-p6:

The following identifiers have no linkage: an identifier declared to be anything other than
an object or a function; an identifier declared to be a function parameter; a block scope
identifier for an object declared without the storage-class specifier extern.

Having no linkage means that the rules mentioned above do not apply to it; that is, the problem cannot be a link error.
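
A small sketch may help to separate the two situations: declaring the same name again in a nested block is fine, because it is a different scope, while declaring it again in the same block is what the compilers reject. The function name shadowing_demo is made up for this illustration; the snippet is written so that it compiles as C as well as C++.

```cpp
// Each `b` below has block scope and no linkage.
int shadowing_demo(void) {
  int b = 1; // first declaration in the function's outermost block
  int seen_inner;
  {
    int b = 2; // nested block: a new object that hides the outer b
    seen_inner = b;
  }
  // int b = 3; // same scope as the first b: uncommenting this line makes
  //            // clang report "redefinition of 'b'"
  return seen_inner * 10 + b; // inner value 2 and outer value 1 give 21
}
```

The inner b stops existing at the closing brace, so the outer b is visible again in the return statement.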

It is natural to regard this as a redefinition error. C++ covers it with the One Definition Rule; C has no rule by that name, but C99-6.7-p3 requires that an identifier with no linkage be declared no more than once in the same scope and name space (tags aside). That is a constraint, so a conforming compiler must issue a diagnostic, but the standard does not prescribe its wording. This is why the error clang reports for the code above (file.c) is worded differently from the one gcc reports, as shown below:

(note the phrase 'with no linkage' in the gcc message)

# ---
# GCC (gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04))
# ---
$ gcc file.c
file.c: In function ‘main’:
file.c:4:6: error: redeclaration of ‘b’ with no linkage
int b=1;
^
file.c:3:6: note: previous declaration of ‘b’ was here
int b;
^

# ---
# CLANG (Apple clang version 13.0.0 (clang-1300.0.29.3))
# ---
% clang file.c
file.c:4:6: error: redefinition of 'b'
int b=1;
^
file.c:3:6: note: previous definition is here
int b;
^
1 error generated.



Answers

With all things above, I think it suffices to answer your questions:

How does clang perform the definition legality check?

For global variables, both clang and gcc follow the C standard rules; that is, duplicate definitions in different files surface through the process called linkage. For local variables, a second declaration in the same scope violates a constraint of the standard (C99-6.7-p3), so the compiler itself must diagnose it; only the wording of the message is left to the implementation.

In fact, both compilers treat the redefinition as an error. Although the names of local variables disappear after compilation (you can verify this in the assembly output), requiring them to be unique within a scope is clearly more natural and helpful.

Am I able to get the variable table (if such a thing exists)?

I do not know much about clang internals, but from the standard passages quoted above and the stages of compilation, we can infer that IdentifierTable might not fit your needs, since it belongs to the preprocessing stage, which comes well before the linking stage. To see how duplicate variables (or, more formally, symbols) are handled and stored at link time, you might want to look at the lld project, in particular its SymbolTable.


