Commit 0e4975e

Add 2024 GSoC report Compute Summary for all detected packages
Signed-off-by: swastik <swastkk@gmail.com>
1 parent a9508b4 commit 0e4975e

2 files changed

Lines changed: 74 additions & 1 deletion

docs/source/archive/gsoc-toc.rst

Lines changed: 10 additions & 1 deletion
@@ -6,7 +6,16 @@ GSoC -- Google Summer of Code
66
open source software development. GSoC is completely online designed to encourage university
77
student participation in open source software development.
88
It was started by Google in 2005.
9-
More about GSoc - <https://summerofcode.withgoogle.com/about/>_
9+
More about GSoC - `<https://summerofcode.withgoogle.com/about/>`_
10+
11+
GSoC 2024
12+
---------
13+
14+
.. toctree::
15+
:maxdepth: 2
16+
17+
gsoc/reports/2024/scancode_toolkit_swastkk
18+
1019

1120
GSoC 2022
1221
---------
Lines changed: 64 additions & 0 deletions
@@ -0,0 +1,64 @@
1+
========================================================================
2+
Compute summary for all detected packages
3+
========================================================================
4+
5+
6+
| **Organization:** `AboutCode <https://aboutcode.org>`_
7+
| **Project:** `ScanCode Toolkit <https://github.com/aboutcode-org/scancode-toolkit>`_
8+
| **Mentee:** `Swastik Sharma (swastkk) <https://github.com/swastkk>`_
9+
| **Mentors:** Philippe Ombredanne, AyanSinhaMahapatra, AvishrantSh, Jonathan Yang, Jay Kumar
10+
11+
Overview
12+
--------
13+
14+
Previously, the summary was computed at the codebase level, which involves ``license_clarity_score``,
15+
``declared_holder``, ``other_license_expressions`` and many more attributes. This project aims to improve scanning accuracy
16+
by computing the summary and license clarity score for each package and its files, rather than for the entire scan.
17+
This involves enhancing the package models and ensuring proper attribute collection for all package ecosystems.
18+
19+
Implementation
20+
--------------
21+
22+
All the work I did is contained in `this single PR <https://github.com/aboutcode-org/scancode-toolkit/pull/3792>`_.
23+
I added a new command line option called ``--package-summary`` that users can pass
24+
to get a package-level summary within a single codebase. The package-level summary involves the
25+
``license_clarity_score`` calculation and population of package attributes like ``copyright``,
26+
``holder``, ``other_license_expression`` and ``notice_text``. This option must be used together with two other options:
27+
``--classify``, which helps ScanCode further classify scanned files and directories to determine whether
28+
they fall in the ``legal``, ``readme``, ``top-level`` or ``manifest`` categories, and ``--package`` (or ``-p``), which
29+
detects various package manifests, lockfiles and package-like data, assembles codebase-level packages
30+
and dependencies from the package data detected in files, and tags files that are part of a package.
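
The option combination described above can be sketched as a small Python helper that builds the command line. This is only an illustration under stated assumptions: ``--package-summary`` is the option proposed in the PR, so it is available only where that branch is installed; the helper name ``build_package_summary_command`` and the paths ``samples/`` and ``scan-result.json`` are hypothetical; ``--json-pp`` is ScanCode's existing pretty-printed JSON output option.

```python
import shlex


def build_package_summary_command(input_path, output_json):
    """Return the argument list for a package-level summary scan.

    Hypothetical helper for illustration; it only assembles the command
    and does not run ScanCode.
    """
    return [
        "scancode",
        "--classify",         # classify files (legal, readme, top-level, manifest)
        "--package",          # detect manifests and assemble packages (short form: -p)
        "--package-summary",  # option proposed in the PR (assumed to be installed)
        "--json-pp", output_json,
        input_path,
    ]


command = build_package_summary_command("samples/", "scan-result.json")
print(shlex.join(command))
```

The resulting list could be handed to ``subprocess.run`` to start the scan, assuming a ScanCode installation that includes the PR.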
31+
32+
This change allows users to get a more refined summary for each individual package present in a codebase.
33+
This feature also improves package assembly for various package ecosystems such as npm, python-whl, rust and rubygems.
34+
35+
36+
Finally, all these changes are tested through multiple unit tests validating both correct
37+
behavior and error handling as needed.
38+
39+
Post GSoC
40+
---------
41+
42+
I would like to get this PR merged into ScanCode Toolkit, so that users can leverage
43+
this feature to expand their package/codebase scanning capabilities.
44+
45+
Links
46+
-----
47+
48+
`Project idea <https://github.com/aboutcode-org/aboutcode/wiki/GSOC-2024-Project-Ideas#compute-summary-for-all-detected-packages>`_
49+
50+
`Official GSoC project page <https://summerofcode.withgoogle.com/programs/2024/projects/JzMlDtnM>`_
51+
52+
`GSoC Proposal <https://docs.google.com/document/d/1TcGqQVzXhTkz6Pmu9UaXAr4R4q1rlT4tof7H7dsVG0o/edit?usp=sharing>`_
53+
54+
Acknowledgements
55+
----------------
56+
57+
I would like to thank my mentors:
58+
- `@pombredanne <https://github.com/pombredanne>`_
59+
- `@AyanSinhaMahapatra <https://github.com/AyanSinhaMahapatra>`_
60+
- `@AvishrantSh <https://github.com/AvishrantSsh>`_
61+
- `@35C4n0r <https://github.com/35C4n0r>`_
62+
63+
The weekly calls were a great help, and the special 1:1 calls with ``@AyanSinhaMahapatra`` and ``@pombredanne``
64+
were so amazing. Thank you for your time and your patience!
