Commit 6d06b9a

Update 06-lsa.md
1 parent c0cd306 commit 6d06b9a

1 file changed

Lines changed: 5 additions & 1 deletion

episodes/06-lsa.md

@@ -112,9 +112,13 @@ To see this, let's begin to reduce the dimensionality of our TF-IDF matrix using
 
 ```python
 from sklearn.decomposition import TruncatedSVD
+
 maxDimensions = min(tfidf.shape)-1
-svdmodel = TruncatedSVD(n_components=maxDimensions, algorithm="arpack")
+
+svdmodel = TruncatedSVD(n_components=maxDimensions, algorithm="arpack") # The "arpack" algorithm is typically more efficient for large sparse matrices compared to the default "randomized" algorithm. This is particularly important when dealing with high-dimensional data, such as TF-IDF matrices, where the number of features (terms) may be large. SVD is typically computed as an approximation when working with large matrices.
+
 lsa = svdmodel.fit_transform(tfidf)
+
 print(lsa)
 ```
 
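The comment added in this commit contrasts the `"arpack"` and `"randomized"` options of `TruncatedSVD`. The following is a minimal sketch for trying both side by side, not part of the commit itself. It assumes a small random sparse matrix, `toy_tfidf`, as a hypothetical stand-in for the lesson's `tfidf`; the variable names are illustrative only. Which solver is faster likely depends on the matrix size and the number of components requested, so timing both on the real data is the safer check.

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.decomposition import TruncatedSVD

# Hypothetical stand-in for the lesson's TF-IDF matrix:
# 40 "documents" x 200 "terms", roughly 10% non-zero entries.
rng = np.random.default_rng(0)
dense = rng.random((40, 200))
dense[dense < 0.9] = 0.0
toy_tfidf = csr_matrix(dense)

# With algorithm="arpack", n_components must be strictly smaller than
# the smaller matrix dimension, hence the "- 1" used in the lesson.
max_dimensions = min(toy_tfidf.shape) - 1

# ARPACK-based solver (the one selected in the diff above).
arpack_svd = TruncatedSVD(n_components=max_dimensions, algorithm="arpack")
lsa_arpack = arpack_svd.fit_transform(toy_tfidf)

# Default randomized solver, seeded here only for reproducibility.
randomized_svd = TruncatedSVD(
    n_components=max_dimensions, algorithm="randomized", random_state=0
)
lsa_randomized = randomized_svd.fit_transform(toy_tfidf)

# Both solvers return one row per document and one column per component.
print(lsa_arpack.shape)      # (40, 39)
print(lsa_randomized.shape)  # (40, 39)

# Largest difference between the five leading singular values.
gap = np.max(
    np.abs(arpack_svd.singular_values_[:5] - randomized_svd.singular_values_[:5])
)
print(f"max gap in top-5 singular values: {gap:.2e}")
```

On a matrix this small the two solvers should agree closely on the leading singular values; wrapping each `fit_transform` call in `time.perf_counter()` is one way to compare their cost on a real TF-IDF matrix.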
