
Commit 1ff3ac8

fabianbs96, mxHuber, and claude authored
Union-Find Alias Analyses (#820)
* Initial PAGBuilder
* Add UnionFind + minor
* Optional type-erasure for PBStrategy + add PBStrategyCombinator
* Add basic union-find based Steensgaard-style alias analysis (WIP)
* Add utility for UnionFindAAResults intersection and caching + make LLVM-based UnionFindAA compile
* Add alias-iterator for union-find alias-results
* Add LLVMUnionFindAliasIterator + NonNullPtr
* Start adding unit-tests for union-find-AA
* Fix PBMixin + add some comments
* Added more basic and context tests for pointers
* Small fix in IndirectionSensUnionFindAA::onAddValue
* Add summary-based union-find alias-analysis
* context tests with different depth
* renamed tests
* More context tests
* fixed error
* removed bad basic tests + expanded unittest
* added context_08 to unittest, fails
* added more context tests to unittest
* Add soundness to BottomupUnionFindAA
* Add convenience-functions to compute union-find alias info for LLVM
* minor
* added more tests + fixed some old ones
* Fixed context unit tests
* added CtxSens and IndirSens unittests
* Some fix in BottomupUnionFindAA
* Fix error due to merge
* Start adding LLVMUnionFindAliasSet (WIP)
* Continue with LLVMUnionFindAliasSet
* Integrate Union-Find-AA into phasar-cli
* debug information
* disabled context_01 indir test due to a crash
* Add library summary to PAG
* fixed Indir Context01
* fixed LLVMAliasSet tests
* Fixed Indir Context02, 03 and 04_0
* Enable passing LLVMFunctionDataFlowFacts into LLVMPAGBuilder
* added missing allocas to ground truths 01 - 04_0
* Fixed ground truth of Indir Context Tests til 08
* new basic03, added basic04, gt fixes til 10_1
* Last of Indir Context tests fixed
* New tests
* fixed cmakelists for cpp tests
* Indir Unittests output comments + 01, 02 half done
* indir unittests
* Add TracingPBStrategy to dump the PAG
* new indir tests + fixed crashes
* fixed some unittests
* fixed half of ctx-sens Context tests
* Nearly all ctx-sens context tests done
* FuncByName TestingSrcLocation
* Update breaking changes
* Fix CtxSensUnionFindAATest.Indirection01
* commit for merge
* Half of Indir Tests fixed
* all unittests working now
* test tool
* Fix LLVMUnionFindAliasSet dependency to phasar_llvm_controlflow
* Add README
* minor
* added other two analyses + results table
* fixed table formatting
* Some cleanup
* Add ValueIdMap to use as cache for alias-sets
* minor
* pre-commit
* Cleanup + bug fixing
* Add Doxygen comments to new alias-analysis headers

  Documents edge types, concepts, strategy combinators, analysis variants (BasicUnionFindAA, CallingContextSens, IndirectionSens, BottomupUnionFindAA), LLVM adapters, and utility types (RawAliasSet, UnionFind, ValueCompressor, TypedArray, CallingContextConstructor).

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* minor
* Make ValueIdMap allocator-aware
* Extend ValueIdMap STL API: range ctor, ilist ctor/assign, erase(const_iterator), operator==, max_size
  - Add iterator-range constructor with O(1) reserve for random-access iterators
  - Add initializer_list constructor (delegating to range ctor) and assignment operator, both accepting any V constructible to ValueT to avoid unnecessary copies
  - Replace erase(iterator) with erase(const_iterator) per STL convention
  - Add operator== hidden friend: compares IsSet bitsets first, then values directly via slot() without find(), for lower constant vs. a naive lookup-per-element
  - Add max_size() capped at INT_MAX (llvm::BitVector internal limit)
  - Add missing type aliases: difference_type, reference, const_reference, pointer, const_pointer
  - Add #include <limits> and <initializer_list>

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* LLVM 17 compatibility
* Fix library-summary relocation with C++20 modules
* Fix new dependency of phasar_llvm_pointer to phasar_controlflow
* ci: ccache dedup per llvm version
* Silence self-assign warning in SelfAssign test for ValueIdMap

---------

Co-authored-by: mxHuber <huber.maximilian.leo@gmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
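For readers unfamiliar with the data structure behind these analyses, the following is a minimal sketch of a union-find (disjoint-set) structure over dense value ids. It is not PhASAR's `UnionFind` class: the name `ToyUnionFind` and its members are hypothetical. It assumes a deliberately simplified, flow-insensitive reading of the Steensgaard-style idea in which every pointer copy `p = q` merges the classes of `p` and `q`, and two values may alias only if they end up in the same class.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical toy type for illustration; not PhASAR's UnionFind.
class ToyUnionFind {
public:
  // Creates N singleton classes with ids 0..N-1.
  explicit ToyUnionFind(uint32_t N) : Parent(N), Size(N, 1) {
    std::iota(Parent.begin(), Parent.end(), 0U);
  }

  // Returns the representative of X's class, shortening the path on the way.
  uint32_t find(uint32_t X) {
    assert(X < Parent.size());
    while (Parent[X] != X) {
      Parent[X] = Parent[Parent[X]]; // path halving
      X = Parent[X];
    }
    return X;
  }

  // Merges the classes of A and B (union by size); returns the new root.
  uint32_t join(uint32_t A, uint32_t B) {
    A = find(A);
    B = find(B);
    if (A == B) {
      return A;
    }
    if (Size[A] < Size[B]) {
      std::swap(A, B);
    }
    Parent[B] = A;
    Size[A] += Size[B];
    return A;
  }

  // In the simplified model above: "may alias" means "same class".
  bool same(uint32_t A, uint32_t B) { return find(A) == find(B); }

private:
  std::vector<uint32_t> Parent;
  std::vector<uint32_t> Size;
};
```

Processing each assignment with one `join` keeps the whole analysis close to linear in the number of statements, which is the usual reason to accept the precision loss of unification-based alias analysis.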
1 parent 477f30d commit 1ff3ac8

131 files changed

Lines changed: 14823 additions & 295 deletions


.github/workflows/ci.yml

Lines changed: 1 addition & 1 deletion
@@ -51,7 +51,7 @@ jobs:
       - name: ccache # This should always come after the actions/checkout step.
         uses: hendrikmuhs/ccache-action@v1.2
         with:
-          key: ${{ github.job }}-${{ matrix.os }}-${{ matrix.build }}
+          key: ${{ github.job }}-${{ matrix.os }}-llvm${{ matrix.llvm-version }}-${{ matrix.build }}
 
       - name: Install Phasar Dependencies
         shell: bash

.github/workflows/deploy-docs.yml

Lines changed: 1 addition & 0 deletions
@@ -8,6 +8,7 @@ permissions:
   contents: write
 jobs:
   build-and-deploy:
+    if: github.repository == 'secure-software-engineering/phasar'
     runs-on: ubuntu-24.04
     strategy:
       fail-fast: true

BreakingChanges.md

Lines changed: 2 additions & 0 deletions
@@ -2,6 +2,8 @@
 
 ## development HEAD
 
+- The function `HelperAnalyses::getAliasInfo()` no longer returns a `LLVMAliasSet &`, but a `LLVMAliasInfoRef`.
+- The location of the library summary `FunctionDataFlowFacts` and `LLVMFunctionDataFlowFacts` has moved to `phasar/Utils/` and `phasar/PhasarLLVM/Utils`, respectively.
 - `IDESolver::initialize()` does no longer return a `bool`. Now, you are always allowed to call `next()` at least once.
 - `IntraMonoProblem` and `InterMonoProblem`, and all reference-implementations of these problems do not receive a TypeHierarchy-pointer anymore in the ctor.
 - Requiring C++20 instead of C++17

CMakeLists.txt

Lines changed: 1 addition & 1 deletion
@@ -410,7 +410,7 @@ if (PHASAR_ENABLE_WARNINGS)
   if (MSVC)
     string(APPEND CMAKE_CXX_FLAGS " /W4")
   else()
-    string(APPEND CMAKE_CXX_FLAGS " -Wall -Wextra -Wno-unused-parameter")
+    string(APPEND CMAKE_CXX_FLAGS " -Wall -Wextra -Wno-unused-parameter -Werror=return-type")
   endif()
 endif (PHASAR_ENABLE_WARNINGS)
 

include/phasar/ControlFlow/CallGraphBase.h

Lines changed: 92 additions & 0 deletions
@@ -12,8 +12,14 @@
 
 #include "phasar/Utils/ByRef.h"
 #include "phasar/Utils/CRTPUtils.h"
+#include "phasar/Utils/Compressor.h"
+#include "phasar/Utils/GraphTraits.h"
+#include "phasar/Utils/IotaIterator.h"
+#include "phasar/Utils/NonNullPtr.h"
 #include "phasar/Utils/TypeTraits.h"
 
+#include "llvm/ADT/STLExtras.h"
+
 #include <concepts>
 #include <type_traits>
 
@@ -127,6 +133,92 @@ template <typename Derived> class CallGraphBase : public CRTPBase<Derived> {
 
   [[nodiscard]] constexpr bool empty() const noexcept { return size() == 0; }
 };
+
+/// A view over a call-graph that implements psr::GraphTraits on the
+/// getCallersOf() relation.
+/// Useful to compute CG-SCCs.
+template <typename CallGraphTy, typename DB> class ReverseCGGraph {
+  static constexpr bool NeedsMapping = !IdType<typename CallGraphTy::f_t>;
+
+  enum class FunctionIdImpl : uint32_t {};
+
+public:
+  using FunctionId = std::conditional_t<NeedsMapping, FunctionIdImpl,
+                                        typename CallGraphTy::f_t>;
+
+  constexpr ReverseCGGraph(
+      NonNullPtr<const CallGraphTy> CGView, NonNullPtr<const DB> IRDB,
+      Compressor<typename CallGraphTy::f_t, FunctionId> FC) noexcept
+    requires(NeedsMapping)
+      : CGView(CGView), IRDB(IRDB), FC(std::move(FC)) {}
+
+  constexpr ReverseCGGraph(NonNullPtr<const CallGraphTy> CGView,
+                           NonNullPtr<const DB> IRDB) noexcept
+      : CGView(CGView), IRDB(IRDB) {
+    FC.reserve(CGView->getNumVertexFunctions());
+    for (const auto &Fun : CGView->getAllVertexFunctions()) {
+      FC.insert(Fun);
+    }
+  }
+
+  NonNullPtr<const CallGraphTy> CGView;
+  NonNullPtr<const DB> IRDB;
+  [[no_unique_address]] std::conditional_t<
+      NeedsMapping, Compressor<typename CallGraphTy::f_t, FunctionId>,
+      NoneCompressor> FC{};
+};
+
+template <typename CallGraphTy, typename DB>
+ReverseCGGraph(NonNullPtr<const CallGraphTy>, NonNullPtr<const DB>)
+    -> ReverseCGGraph<CallGraphTy, DB>;
+template <typename CallGraphTy, typename DB>
+ReverseCGGraph(const CallGraphTy *, const DB *)
+    -> ReverseCGGraph<CallGraphTy, DB>;
+
+template <typename CallGraphTy, typename DB>
+struct GraphTraits<ReverseCGGraph<CallGraphTy, DB>> {
+  using graph_type = ReverseCGGraph<CallGraphTy, DB>;
+  using value_type = typename CallGraphTy::f_t;
+  using vertex_t = typename graph_type::FunctionId;
+  using edge_t = vertex_t;
+
+  static constexpr vertex_t Invalid = vertex_t(UINT32_MAX);
+
+  static constexpr auto mapToFunction(const graph_type &G) {
+    return [&G](ByConstRef<typename CallGraphTy::n_t> Inst) {
+      const auto &Fun = G.IRDB->getFunctionOf(Inst);
+      return G.FC.getOrNull(Fun).value();
+    };
+  }
+
+  static constexpr auto outEdges(const graph_type &G, vertex_t Vtx) {
+    return llvm::map_range(G.CGView->getCallersOf(G.FC[Vtx]), mapToFunction(G));
+  }
+
+  static constexpr decltype(auto) nodes(const graph_type &G) {
+    return G.CGView->getAllVertexFunctions();
+  }
+
+  // TODO: Roots
+
+  static constexpr auto vertices(const graph_type &G) {
+    return iota<vertex_t>(G.CGView->getNumVertexFunctions());
+  }
+
+  static constexpr decltype(auto) node(const graph_type &G, vertex_t Vtx) {
+    return G.FC[Vtx];
+  }
+
+  static constexpr size_t size(const graph_type &G) {
+    return G.CGView->getNumVertexFunctions();
+  }
+
+  static constexpr vertex_t target(edge_t Edge) { return Edge; }
+  static constexpr vertex_t withEdgeTarget(edge_t /*Edge*/, vertex_t Vtx) {
+    return Vtx;
+  }
+};
+
 } // namespace psr
 
 #endif // PHASAR_CONTROLFLOW_CALLGRAPHBASE_H
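The doc comment on `ReverseCGGraph` says the view is useful to compute call-graph SCCs. As a rough illustration of that use case, here is a self-contained sketch that runs Tarjan's algorithm over a callers-of relation indexed by dense function ids. It does not use PhASAR's `GraphTraits` or its SCC utilities; `ToyReverseCG`, `ToySCCBuilder`, and `computeSCCs` are hypothetical names, and the choice of Tarjan's algorithm is an assumption made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical toy graph: CallersOf[F] lists the dense ids of F's callers.
struct ToyReverseCG {
  std::vector<std::vector<uint32_t>> CallersOf;
};

constexpr uint32_t Unvisited = UINT32_MAX;

// Tarjan's SCC algorithm over the callers-of relation.
class ToySCCBuilder {
public:
  explicit ToySCCBuilder(const ToyReverseCG &G)
      : G(G), Index(G.CallersOf.size(), Unvisited), Low(G.CallersOf.size(), 0),
        OnStack(G.CallersOf.size(), false),
        SCCOf(G.CallersOf.size(), Unvisited) {}

  // Returns SCCOf, where SCCOf[F] is the SCC number assigned to function F.
  std::vector<uint32_t> run() {
    for (uint32_t V = 0; V < G.CallersOf.size(); ++V) {
      if (Index[V] == Unvisited) {
        visit(V);
      }
    }
    return SCCOf;
  }

private:
  void visit(uint32_t V) {
    assert(V < G.CallersOf.size());
    Index[V] = Low[V] = NextIndex++;
    Stack.push_back(V);
    OnStack[V] = true;
    for (uint32_t W : G.CallersOf[V]) {
      if (Index[W] == Unvisited) {
        visit(W);
        Low[V] = std::min(Low[V], Low[W]);
      } else if (OnStack[W]) {
        Low[V] = std::min(Low[V], Index[W]);
      }
    }
    if (Low[V] == Index[V]) { // V is the root of an SCC: pop its members
      uint32_t W = Unvisited;
      do {
        W = Stack.back();
        Stack.pop_back();
        OnStack[W] = false;
        SCCOf[W] = NumSCCs;
      } while (W != V);
      ++NumSCCs;
    }
  }

  const ToyReverseCG &G;
  std::vector<uint32_t> Index;
  std::vector<uint32_t> Low;
  std::vector<bool> OnStack;
  std::vector<uint32_t> SCCOf;
  std::vector<uint32_t> Stack;
  uint32_t NextIndex = 0;
  uint32_t NumSCCs = 0;
};

inline std::vector<uint32_t> computeSCCs(const ToyReverseCG &G) {
  return ToySCCBuilder(G).run();
}
```

Mutually recursive functions end up with the same SCC number, so a summary-based analysis can treat each such group as one unit. The dense ids play the role that `FunctionId` plays in the diff above: they let per-function state live in plain vectors.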

include/phasar/DB/ProjectIRDBBase.h

Lines changed: 8 additions & 0 deletions
@@ -10,6 +10,7 @@
 #ifndef PHASAR_DB_PROJECTIRDBBASE_H
 #define PHASAR_DB_PROJECTIRDBBASE_H
 
+#include "phasar/Utils/ByRef.h"
 #include "phasar/Utils/Nullable.h"
 #include "phasar/Utils/TypeTraits.h"
 
@@ -84,6 +85,13 @@ class LLVM_DEPRECATED(
     return self().hasFunctionImpl(FunctionName);
   }
 
+  /// Returns the function that contains the given instruction Inst.
+  /// Each instruction must be part of a function.
+  [[nodiscard]] f_t getFunctionOf(ByConstRef<n_t> Inst) const {
+    assert(isValid());
+    return self().getFunctionOfImpl(Inst);
+  }
+
   /// Returns the global variable if available, nullptr/nullopt
   /// otherwise.
   [[nodiscard]] Nullable<g_t>
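The new `getFunctionOf()` follows the same pattern as the surrounding members: a public function on the CRTP base forwards to a `*Impl` function on the derived class via `self()`. The sketch below shows that pattern in isolation. All types here (`ToyFunction`, `ToyInst`, `ToyIRDBBase`, `ToyIRDB`) are hypothetical stand-ins and not part of PhASAR; the real base class obtains `f_t` and `n_t` from traits rather than hard-coding them.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the IR's function and instruction types.
struct ToyFunction {
  std::string Name;
};
struct ToyInst {
  const ToyFunction *Parent = nullptr;
};

// CRTP base: the public API forwards to the derived class's *Impl function.
template <typename Derived> class ToyIRDBBase {
public:
  [[nodiscard]] const ToyFunction *getFunctionOf(const ToyInst *Inst) const {
    return self().getFunctionOfImpl(Inst);
  }

private:
  const Derived &self() const noexcept {
    return static_cast<const Derived &>(*this);
  }
};

// Derived class: keeps the Impl private and befriends the base.
class ToyIRDB : public ToyIRDBBase<ToyIRDB> {
  friend ToyIRDBBase<ToyIRDB>;

  [[nodiscard]] const ToyFunction *
  getFunctionOfImpl(const ToyInst *Inst) const {
    assert(Inst != nullptr);
    return Inst->Parent;
  }
};
```

The call is resolved at compile time, so the indirection costs nothing at run time, and generic code such as the `mapToFunction` lambda in `CallGraphBase.h` can call `getFunctionOf()` on any database type that provides the `Impl`.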

include/phasar/DataFlow/IfdsIde/Solver/Compressor.h

Lines changed: 0 additions & 21 deletions
@@ -9,27 +9,6 @@
 
 namespace psr {
 
-struct NoneCompressor final {
-  constexpr NoneCompressor() noexcept = default;
-
-  template <typename T>
-    requires(!std::is_same_v<NoneCompressor, T>)
-  constexpr NoneCompressor(const T & /*unused*/) noexcept {}
-
-  template <typename T>
-  [[nodiscard]] decltype(auto) getOrInsert(T &&Val) const noexcept {
-    return std::forward<T>(Val);
-  }
-  template <typename T>
-  [[nodiscard]] decltype(auto) operator[](T &&Val) const noexcept {
-    return std::forward<T>(Val);
-  }
-  void reserve(size_t /*unused*/) const noexcept {}
-
-  [[nodiscard]] size_t size() const noexcept { return 0; }
-  [[nodiscard]] size_t capacity() const noexcept { return 0; }
-};
-
 class LLVMProjectIRDB;
 
 /// Once we have fast instruction IDs (as we already have in IntelliSecPhasar),
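`NoneCompressor` disappears from this header while `CallGraphBase.h` now uses it through `phasar/Utils/Compressor.h`, which suggests the type was relocated rather than dropped. To illustrate why an identity compressor is useful next to a real one, here is a hedged sketch: a mapping compressor assigns dense ids to arbitrary values, an identity compressor passes through values that already are ids, and `std::conditional_t` picks between them. `ToyCompressor`, `ToyNoneCompressor`, and `CompressorFor` are hypothetical names, and using `std::is_integral_v` as the "already an id" test is an assumption made only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <type_traits>
#include <unordered_map>
#include <vector>

// Hypothetical: maps arbitrary hashable values to dense ids and back.
template <typename T> class ToyCompressor {
public:
  // Returns the id of Val, assigning the next free id on first sight.
  uint32_t getOrInsert(const T &Val) {
    auto [It, Inserted] = ToId.try_emplace(Val, uint32_t(FromId.size()));
    if (Inserted) {
      FromId.push_back(Val);
    }
    return It->second;
  }

  // Maps an id back to the value it was assigned to.
  const T &operator[](uint32_t Id) const {
    assert(Id < FromId.size());
    return FromId[Id];
  }

  std::size_t size() const noexcept { return FromId.size(); }

private:
  std::unordered_map<T, uint32_t> ToId;
  std::vector<T> FromId;
};

// Hypothetical identity compressor: the value already is its own id.
struct ToyNoneCompressor {
  template <typename T> T getOrInsert(T Val) const noexcept { return Val; }
  template <typename T> T operator[](T Val) const noexcept { return Val; }
};

// Assumption for this sketch: integral types count as "already an id".
template <typename T>
using CompressorFor = std::conditional_t<std::is_integral_v<T>,
                                         ToyNoneCompressor, ToyCompressor<T>>;
```

Because the identity variant is an empty type, a member declared with `[[no_unique_address]]`, as `FC` is in `ReverseCGGraph`, need not take up any storage when no mapping is required.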

include/phasar/PhasarLLVM/ControlFlow/VTA/TypePropagator.h

Lines changed: 1 addition & 1 deletion
@@ -11,6 +11,7 @@
 #define PHASAR_PHASARLLVM_CONTROLFLOW_TYPEPROPAGATOR_H
 
 #include "phasar/PhasarLLVM/ControlFlow/VTA/TypeAssignmentGraph.h"
+#include "phasar/Utils/SCCId.h"
 #include "phasar/Utils/TypedVector.h"
 
 #include "llvm/ADT/DenseSet.h"
@@ -22,7 +23,6 @@ class Value;
 } // namespace llvm
 
 namespace psr {
-template <typename GraphNodeId> struct SCCId;
 template <typename GraphNodeId> struct SCCHolder;
 template <typename GraphNodeId> struct SCCDependencyGraph;
 template <typename GraphNodeId> struct SCCOrder;

include/phasar/PhasarLLVM/DB/LLVMProjectIRDB.h

Lines changed: 5 additions & 0 deletions
@@ -189,6 +189,11 @@ class LLVMProjectIRDB : public ProjectIRDBBase<LLVMProjectIRDB> {
   hasFunctionImpl(llvm::StringRef FunctionName) const noexcept {
     return Mod->getFunction(FunctionName) != nullptr;
   }
+  [[nodiscard]] f_t getFunctionOfImpl(n_t Inst) const {
+    assert(Inst != nullptr);
+    return Inst->getFunction();
+  }
+
   [[nodiscard]] g_t
   getGlobalVariableImpl(llvm::StringRef GlobalVariableName) const;
   [[nodiscard]] g_t

include/phasar/PhasarLLVM/DataFlow/IfdsIde/LLVMFunctionDataFlowFacts.h

Lines changed: 0 additions & 89 deletions
This file was deleted.
