|
3 | 3 | FederatedCode Overview |
4 | 4 | ======================== |
5 | 5 |
|
6 | | -**FederatedCode** is a server to script and automate the process of |
7 | | -**Software Composition Analysis (SCA)** to identify any open source components |
8 | | -and their license compliance data in an application’s codebase. FederatedCode can be |
9 | | -used for various use cases, such as Docker container and VM composition |
10 | | -analyses, among other applications. |
| 6 | +**FederatedCode** is a solution for decentralized and federated metadata about software |
| 7 | +applications, with a focus on known vulnerabilities, package versions, origin and licensing. |
| 8 | +These data are essential to support efficient **Software Composition Analysis (SCA)** with quality |
| 9 | +open reference data about open source components. |
| 10 | + |
11 | 11 |
|
12 | 12 | Why FederatedCode? |
13 | 13 | -------------------- |
14 | 14 |
|
15 | | -Modern software is built from many open source packages assembled with new code. |
16 | | -Knowing which free and open source code package is in use matters because: |
17 | | - |
18 | | -- You're required to **know the license of third-party code** before using it, and |
19 | | -- You want to avoid using buggy, outdated or vulnerable components. |
20 | | - |
21 | | -It's usually convenient to include and reuse new code downloaded from the |
22 | | -internet; however, it's often surprisingly hard to get a proper inventory of |
23 | | -all third-party code origins and licenses used in a software project. |
24 | | -There are some great tools available to scan your code and help uncover these |
25 | | -details. For example, when you reuse only a few FOSS components in a single |
26 | | -project, running one of these tools, such as the **ScanCode-toolkit**, manually |
27 | | -along with a spreadsheet might be enough to manage your software composition |
28 | | -analysis. |
29 | | - |
30 | | -However, when you scale up, running automated and reproducible analysis pipelines |
31 | | -that are adapted to a software project's unique context and technology platform |
32 | | -can be difficult. This will require deploying and running multiple specialized |
33 | | -tools and merge their results with a consistent workflow. Moreover, |
34 | | -when reusing thousands of open source packages is becoming commonplace, |
35 | | -code scans pipelines need to be scripted as code is running on servers backed |
36 | | -by a shared database, not on a laptop. |
37 | | - |
38 | | -For instance, when you analyze Docker container images, there could be hundreds |
39 | | -to thousands of system packages, such as Debian, RPM, Alpine, and application |
40 | | -packages, including npm, PyPI, Rubygems, Maven, installed in an image |
41 | | -side-by-side with your own code. Taking care of all this can be |
42 | | -an extremely hard task, and that's when **FederatedCode** comes into play to help |
43 | | -organizing these complex code analysis as scripted pipelines and store their |
44 | | -results in a database for automated code analysis. |
45 | | - |
46 | | -What is ScanPipe? |
47 | | ------------------ |
48 | | - |
49 | | -**ScanPipe** is a developer-friendly framework and application that helps |
50 | | -software analysts and engineers build and manage real-life software composition |
51 | | -analysis projects as scripted pipelines. |
52 | | - |
53 | | -**ScanPipe** provides a unified framework to the infrastructure that is |
54 | | -required to execute and organize these software composition analysis projects. |
55 | | - |
56 | | -Should I use ScanPipe? |
57 | | ----------------------- |
58 | | - |
59 | | -If you are working on a software composition analysis project, or you |
60 | | -are planning to start a new one, consider the following questions: |
61 | | - |
62 | | -1. **Automation**: Is the project part of a larger compliance program |
63 | | - (as opposed to a one-off) and that you require automation? |
64 | | -2. **Complexity**: Does the project use many third-party components or technologies? |
65 | | -3. **Reproducibility**: Is it important that the results are reproducible, traceable, |
66 | | - and auditable? |
67 | | - |
68 | | -If you answered **"yes"** to any of the above, keep reading - ScanPipe can help |
69 | | -you. If the answer is **"no"** to all of the above, which is a valid scenario, |
70 | | -e.g., when you are doing small-scale analysis, ScanPipe may provide only limited |
71 | | -benefit for you. |
72 | | - |
73 | | -The first set of available pipelines helps automate the analysis of Docker |
74 | | -container images and virtual machine (VM) disk images that often harbor |
75 | | -comprehensive software stacks from an operating system with its kernel through |
76 | | -system and application packages to original and custom applications. |
| 15 | +Modern software systems (and the organizations building and using them) rely on reusing free and |
| 16 | +open source software (FOSS). |
77 | 17 |
|
78 | | -Dependencies and Internal Tools |
79 | | -------------------------------- |
| 18 | +Knowing which free and open source packages are in use, along with their origin and security |
| 19 | +issues, matters because: |
80 | 20 |
|
81 | | -FederatedCode is essentially a `Django <https://www.djangoproject.com/>`_-based |
82 | | -application wrapper around the |
83 | | -`ScanCode Toolkit <https://github.com/aboutcode-org/scancode-toolkit>`_ scanning engine. |
| 21 | +- You want to avoid using buggy, outdated or vulnerable components, and |
| 22 | +- You're required to **know the license of third-party code** before using it. |
84 | 23 |
|
85 | | -The **Django framework** is leveraged for many aspects of FederatedCode: |
| 24 | +This requires quality reference metadata to support efficient analysis and the automation of |
| 25 | +compliance processes. Existing FOSS metadata databases are centralized and "too big to share": |
| 26 | +their metadata is locked behind gated APIs, which promotes lock-in and prevents privacy-preserving offline usage. |
86 | 27 |
|
87 | | -- :ref:`user_interface` |
88 | | -- :ref:`rest_api` |
89 | | -- :ref:`command_line_interface` |
90 | | -- :ref:`data_model` |
| 28 | +FederatedCode is a decentralized and federated system for FOSS metadata, enabling social review and |
| 29 | +sharing of curated metadata along with air-gapped, local usage to preserve privacy and |
| 30 | +confidentiality. |
91 | 31 |
|
92 | | -.. note:: |
93 | | - Multiple applications from the Django eco-system are also included, |
94 | | - see the `setup.cfg <https://github.com/aboutcode-org/federatedcode/blob/main/setup.cfg>`_ file |
95 | | - for an exhaustive list of dependencies. |
| 32 | +Because FederatedCode is decentralized and federated, it promotes sharing without a single |
| 33 | +centralized owner or point of control. |
96 | 34 |
|
97 | | -The second essential part of FederatedCode is the **ScanCode Toolkit**, which is used |
98 | | -for archives extraction and as the scanning engine. |
| 35 | +FederatedCode's distributed metadata collection process includes metadata crawling, curation and |
| 36 | +sharing, and its application to open source software package origin, license and vulnerabilities. |
| 37 | +The project strives to implement the concepts outlined in "Federated and decentralized metadata |
| 38 | +system" published at https://www.tdcommons.org/dpubs_series/5632/ |
99 | 39 |
|
100 | | -The nexB `container-inspector <https://github.com/aboutcode-org/container-inspector>`_ library |
101 | | -is also a key component of FederatedCode as this tool is used to analyse Docker |
102 | | -images, containers, root filesystems, and virtual machine images. |
103 | 40 |
|
104 | | -.. note:: |
105 | | - As a common practice, FederatedCode releases usually follow ScanCode Toolkit releases |
106 | | - to ensure the latest improvements of the scanning engines are included in the |
107 | | - latest release of FederatedCode. |
| 41 | +What is FederatedCode? |
| 42 | +--------------------------- |
108 | 43 |
|
| 44 | +**FederatedCode** is composed of multiple distributed sub-systems: |
109 | 45 |
|
110 | | -.. Some of this documentation is borrowed from the metaflow documentation and is also |
111 | | - under Apache-2.0 |
112 | | -.. Copyright (c) Netflix |
| 46 | +- A system to store versioned metadata as structured text (JSON, YAML) in multiple Git repositories |
| 47 | + structured to enable direct content retrieval using a Package URL (PURL), |
| 48 | +- A series of utilities to synchronize AboutCode databases with these versioned metadata, and |
| 49 | +- A system to publish package-centric events such as the release of a new package version, the |
| 50 | + publication of a vulnerability, or the availability of detailed scans, analyses and SBOMs, using a |
| 51 | + publish/subscribe mechanism over ActivityPub. This further enables distributed discussions and |
| 52 | + curation of the data, in the open. |
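
As a rough illustration of the first sub-system above, the sketch below shows how a Package URL could be mapped to a file path inside a metadata Git repository so that content can be retrieved directly. The function name ``purl_to_metadata_path``, the ``type/namespace/name/version`` directory layout, and the ``metadata.yml`` filename are all assumptions made for this example, not FederatedCode's actual layout. The parsing is deliberately simplified: it ignores qualifiers and subpaths and does no percent-decoding.

```python
def purl_to_metadata_path(purl: str, filename: str = "metadata.yml") -> str:
    """Map a Package URL to a HYPOTHETICAL path in a metadata Git repository.

    The layout (type/namespace/name/version/filename) is an assumption for
    illustration only; the real repository structure may differ.
    """
    scheme, sep, remainder = purl.partition(":")
    if scheme != "pkg" or not sep:
        raise ValueError(f"not a Package URL: {purl!r}")

    # Drop the optional subpath (#...) and qualifiers (?...): ignored in this sketch.
    remainder = remainder.split("#", 1)[0].split("?", 1)[0]

    # Split into path segments: the first is the package type, the last is name[@version].
    segments = [segment for segment in remainder.strip("/").split("/") if segment]
    if len(segments) < 2:
        raise ValueError(f"a Package URL needs at least a type and a name: {purl!r}")

    package_type = segments[0].lower()
    namespace_segments = segments[1:-1]
    name, _, version = segments[-1].partition("@")
    if not name:
        raise ValueError(f"missing package name: {purl!r}")

    parts = [package_type, *namespace_segments, name]
    if version:
        parts.append(version)
    parts.append(filename)
    return "/".join(parts)


if __name__ == "__main__":
    print(purl_to_metadata_path("pkg:pypi/django@4.2.1"))
    print(purl_to_metadata_path("pkg:maven/org.apache.commons/commons-lang3@3.12.0?type=jar"))
```

With a deterministic mapping of this kind, a client that knows a package's PURL can fetch the matching file from a local clone without querying a central API, which is what makes offline, air-gapped usage practical.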