Commit bf0dcbe (1 parent d2e61aa)
object-storage: add new 'sourcegraph' bucket

1 file changed: docs/self-hosted/external-services/object-storage.mdx (115 additions & 60 deletions)
By default, Sourcegraph will use a `sourcegraph/blobstore` server bundled with t…

You can instead configure your instance to store this data in an S3 or GCS bucket. Doing so may decrease your hosting costs, as persistent volumes are often more expensive than the same storage space in an object store service.

## `sourcegraph` bucket

<Callout type="warning">
Starting in Sourcegraph 7.2, self-hosted Sourcegraph instances using S3 or GCS object storage should provision an additional bucket named `sourcegraph`. Sourcegraph currently reports a warning when this bucket is not present, and the bucket will become required for new features in a future release. No action is required if you are using the default `sourcegraph/blobstore`.
</Callout>

The `sourcegraph` bucket is intended to be the single bucket for new Sourcegraph features. Instead of creating one bucket per feature, new features store objects under namespaced key prefixes within this bucket.

Existing buckets for code graph indexes and search jobs remain in use. This change ensures future features can be enabled without requiring a new bucket for each feature.

### Using S3 for the `sourcegraph` bucket

Set the following environment variables to target an S3 bucket for shared Sourcegraph uploads.

- `SOURCEGRAPH_UPLOAD_BACKEND=S3`
- `SOURCEGRAPH_UPLOAD_BUCKET=sourcegraph` (default)
- `SOURCEGRAPH_UPLOAD_AWS_ENDPOINT=https://s3.us-east-1.amazonaws.com`
- `SOURCEGRAPH_UPLOAD_AWS_ACCESS_KEY_ID=<your access key>`
- `SOURCEGRAPH_UPLOAD_AWS_SECRET_ACCESS_KEY=<your secret key>`
- `SOURCEGRAPH_UPLOAD_AWS_SESSION_TOKEN=<your session token>` (optional)
- `SOURCEGRAPH_UPLOAD_AWS_USE_EC2_ROLE_CREDENTIALS=true` (optional; set to use the EC2 metadata API instead of static credentials)
- `SOURCEGRAPH_UPLOAD_AWS_USE_PATH_STYLE=false` (optional)
- `SOURCEGRAPH_UPLOAD_AWS_REGION=us-east-1` (default)
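
Taken together, the variables above might be collected in an environment file sourced by the relevant containers. This is a sketch, assuming a bucket in `us-east-1`; the credential values are placeholders, and in practice you should prefer your provider's secret-handling service over literal keys:

```shell
# Sketch: shared `sourcegraph` bucket backed by S3.
# Credential values are placeholders, not real keys.
export SOURCEGRAPH_UPLOAD_BACKEND=S3
export SOURCEGRAPH_UPLOAD_BUCKET=sourcegraph
export SOURCEGRAPH_UPLOAD_AWS_ENDPOINT=https://s3.us-east-1.amazonaws.com
export SOURCEGRAPH_UPLOAD_AWS_REGION=us-east-1
export SOURCEGRAPH_UPLOAD_AWS_ACCESS_KEY_ID='<your access key>'
export SOURCEGRAPH_UPLOAD_AWS_SECRET_ACCESS_KEY='<your secret key>'
```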

### Using GCS for the `sourcegraph` bucket

Set the following environment variables to target a GCS bucket for shared Sourcegraph uploads.

- `SOURCEGRAPH_UPLOAD_BACKEND=GCS`
- `SOURCEGRAPH_UPLOAD_BUCKET=sourcegraph` (default)
- `SOURCEGRAPH_UPLOAD_GCP_PROJECT_ID=<my project id>`
- `SOURCEGRAPH_UPLOAD_GOOGLE_APPLICATION_CREDENTIALS_FILE=</path/to/file>`
- `SOURCEGRAPH_UPLOAD_GOOGLE_APPLICATION_CREDENTIALS_FILE_CONTENT=<{"my": "content"}>`
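
As a sketch, the GCS variant with a volume-mounted service-account key might look like the following; the project ID and key-file path are assumptions for your environment:

```shell
# Sketch: shared `sourcegraph` bucket backed by GCS.
# Project ID and key-file path are placeholders for your environment.
export SOURCEGRAPH_UPLOAD_BACKEND=GCS
export SOURCEGRAPH_UPLOAD_BUCKET=sourcegraph
export SOURCEGRAPH_UPLOAD_GCP_PROJECT_ID=my-project
export SOURCEGRAPH_UPLOAD_GOOGLE_APPLICATION_CREDENTIALS_FILE=/secrets/gcs-key.json
```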

### Automatically provision the `sourcegraph` bucket

If you would like to allow your Sourcegraph instance to manage the target bucket configuration, set the following environment variable:

<Callout type="note">
This requires additional bucket-management permissions from your configured storage vendor (AWS or GCP).
</Callout>

- `SOURCEGRAPH_UPLOAD_MANAGE_BUCKET=true`
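
If you prefer not to grant bucket-management permissions, you can leave `SOURCEGRAPH_UPLOAD_MANAGE_BUCKET` unset and pre-create the bucket yourself. A sketch using the AWS CLI, where the bucket name and region are assumptions for your environment:

```shell
# Sketch: create and verify the bucket manually with the AWS CLI,
# instead of letting the instance manage it. Requires AWS credentials.
aws s3api create-bucket --bucket sourcegraph --region us-east-1
aws s3api head-bucket --bucket sourcegraph  # exits 0 when the bucket is reachable
```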

## Code Graph Indexes

To target a managed object storage service for storing [code graph index uploads](../../code-navigation/precise-code-navigation), you will need to set a handful of environment variables for configuration and authentication to the target service.

- If you are running a `sourcegraph/server` deployment, set the environment variables on the server container
- If you are running via Docker Compose or Kubernetes, set the environment variables on the `frontend`, `worker`, and `precise-code-intel-worker` containers

### Using S3 for the Code Graph Indexes bucket


To target an S3 bucket you've already provisioned, set the following environment variables. Authentication can be done through [an access and secret key pair](https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html#access-keys-and-secret-access-keys) (and an optional session token), or via the EC2 metadata API.

<Callout type="warning">
Never commit AWS access keys in Git. You should consider using a secret handling service offered by your cloud provider.
</Callout>


- `PRECISE_CODE_INTEL_UPLOAD_BACKEND=S3`
- `PRECISE_CODE_INTEL_UPLOAD_BUCKET=<my bucket name>`
- `PRECISE_CODE_INTEL_UPLOAD_AWS_ENDPOINT=https://s3.us-east-1.amazonaws.com`
- `PRECISE_CODE_INTEL_UPLOAD_AWS_ACCESS_KEY_ID=<your access key>`
- `PRECISE_CODE_INTEL_UPLOAD_AWS_SECRET_ACCESS_KEY=<your secret key>`
- `PRECISE_CODE_INTEL_UPLOAD_AWS_SESSION_TOKEN=<your session token>` (optional)
- `PRECISE_CODE_INTEL_UPLOAD_AWS_USE_EC2_ROLE_CREDENTIALS=true` (optional; set to use the EC2 metadata API instead of static credentials)
- `PRECISE_CODE_INTEL_UPLOAD_AWS_REGION=us-east-1` (default)
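
Because the endpoint's subdomain must match the target region, one approach is to derive the endpoint from the region so the two cannot drift apart. A sketch, where the region and bucket name are assumptions for your environment:

```shell
# Sketch: derive the endpoint from the region to keep them in sync.
# Region and bucket name are placeholders.
export PRECISE_CODE_INTEL_UPLOAD_BACKEND=S3
export PRECISE_CODE_INTEL_UPLOAD_BUCKET=my-code-graph-bucket
export PRECISE_CODE_INTEL_UPLOAD_AWS_REGION=eu-west-2
export PRECISE_CODE_INTEL_UPLOAD_AWS_ENDPOINT="https://s3.${PRECISE_CODE_INTEL_UPLOAD_AWS_REGION}.amazonaws.com"
```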

<Callout type="note">
If a non-default region is supplied, ensure that the subdomain of the endpoint URL (_the `AWS_ENDPOINT` value_) matches the target region.
</Callout>

<Callout type="tip">
You don't need to set the `PRECISE_CODE_INTEL_UPLOAD_AWS_ACCESS_KEY_ID` environment variable when using `PRECISE_CODE_INTEL_UPLOAD_AWS_USE_EC2_ROLE_CREDENTIALS=true` because role credentials will be automatically resolved. Attach the IAM role to the EC2 instances hosting the `frontend`, `worker`, and `precise-code-intel-worker` containers in a multi-node environment.
</Callout>


### Using GCS for the Code Graph Indexes bucket

To target a GCS bucket you've already provisioned, set the following environment variables. Authentication is done through a service account key, supplied either as a path to a volume-mounted file or as the contents read in as an environment variable payload.

- `PRECISE_CODE_INTEL_UPLOAD_BACKEND=GCS`
- `PRECISE_CODE_INTEL_UPLOAD_BUCKET=<my bucket name>`
- `PRECISE_CODE_INTEL_UPLOAD_GCP_PROJECT_ID=<my project id>`
- `PRECISE_CODE_INTEL_UPLOAD_GOOGLE_APPLICATION_CREDENTIALS_FILE=</path/to/file>`
- `PRECISE_CODE_INTEL_UPLOAD_GOOGLE_APPLICATION_CREDENTIALS_FILE_CONTENT=<{"my": "content"}>`
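
A sketch of the inline-credential variant, where the service-account key JSON is passed directly rather than mounted as a file; the JSON content, project ID, and bucket name below are placeholders, not real values:

```shell
# Sketch: GCS with the service-account key supplied inline.
# The JSON below is a placeholder, not a real key.
export PRECISE_CODE_INTEL_UPLOAD_BACKEND=GCS
export PRECISE_CODE_INTEL_UPLOAD_BUCKET=my-code-graph-bucket
export PRECISE_CODE_INTEL_UPLOAD_GCP_PROJECT_ID=my-project
export PRECISE_CODE_INTEL_UPLOAD_GOOGLE_APPLICATION_CREDENTIALS_FILE_CONTENT='{"type": "service_account", "project_id": "my-project"}'
```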

### Automatically provision the Code Graph Indexes bucket

If you would like to allow your Sourcegraph instance to control the creation and lifecycle configuration of the target bucket, set the following environment variables:

<Callout type="note">
This requires additional bucket-management permissions from your configured storage vendor (AWS or GCP).
</Callout>

- `PRECISE_CODE_INTEL_UPLOAD_MANAGE_BUCKET=true`
- `PRECISE_CODE_INTEL_UPLOAD_TTL=168h` (default)
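
The default TTL of `168h` corresponds to seven days. As a sketch, a managed bucket with a two-week retention window might look like:

```shell
# Sketch: managed bucket with uploads expiring after 14 days (2 * 168h).
export PRECISE_CODE_INTEL_UPLOAD_MANAGE_BUCKET=true
export PRECISE_CODE_INTEL_UPLOAD_TTL=336h
```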

## Search Job Results

To target a third-party managed object storage service for storing [search job results](../../code-search/types/search-jobs), you must set a handful of environment variables for configuration and authentication to the target service.

- If you are running a `sourcegraph/server` deployment, set the environment variables on the server container
- If you are running via Docker Compose or Kubernetes, set the environment variables on the `frontend` and `worker` containers

### Using S3 for the Search Job Results bucket


Set the following environment variables to target an S3 bucket you've already provisioned. Authentication can be done through [an access and secret key pair](https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html#access-keys-and-secret-access-keys) (and optionally a session token), or via the EC2 metadata API.

<Callout type="warning">
Never commit AWS access keys in Git. You should consider using a secret handling service offered by your cloud provider.
</Callout>

- `SEARCH_JOBS_UPLOAD_BACKEND=S3`
- `SEARCH_JOBS_UPLOAD_BUCKET=<my bucket name>`
- `SEARCH_JOBS_UPLOAD_AWS_ENDPOINT=https://s3.us-east-1.amazonaws.com`
- `SEARCH_JOBS_UPLOAD_AWS_ACCESS_KEY_ID=<your access key>`
- `SEARCH_JOBS_UPLOAD_AWS_SECRET_ACCESS_KEY=<your secret key>`
- `SEARCH_JOBS_UPLOAD_AWS_SESSION_TOKEN=<your session token>` (optional)
- `SEARCH_JOBS_UPLOAD_AWS_USE_EC2_ROLE_CREDENTIALS=true` (optional; set to use the EC2 metadata API instead of static credentials)
- `SEARCH_JOBS_UPLOAD_AWS_REGION=us-east-1` (default)

<Callout type="note">
If a non-default region is supplied, ensure that the subdomain of the endpoint URL (the `AWS_ENDPOINT` value) matches the target region.
</Callout>

<Callout type="tip">
You don't need to set the `SEARCH_JOBS_UPLOAD_AWS_ACCESS_KEY_ID` environment variable when using `SEARCH_JOBS_UPLOAD_AWS_USE_EC2_ROLE_CREDENTIALS=true` because role credentials will be automatically resolved.
</Callout>
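
With role credentials resolved from the instance profile, the configuration reduces to a sketch like the following; the bucket name is an assumption, and no static keys are set:

```shell
# Sketch: S3 via an attached IAM role; no static credentials needed.
export SEARCH_JOBS_UPLOAD_BACKEND=S3
export SEARCH_JOBS_UPLOAD_BUCKET=my-search-jobs-bucket
export SEARCH_JOBS_UPLOAD_AWS_ENDPOINT=https://s3.us-east-1.amazonaws.com
export SEARCH_JOBS_UPLOAD_AWS_USE_EC2_ROLE_CREDENTIALS=true
```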

### Using GCS for the Search Job Results bucket

Set the following environment variables to target a GCS bucket you've already provisioned. Authentication is done through a service account key, either as a path to a volume-mounted file or as the contents read in as an environment variable payload.

- `SEARCH_JOBS_UPLOAD_BACKEND=GCS`
- `SEARCH_JOBS_UPLOAD_BUCKET=<my bucket name>`
- `SEARCH_JOBS_UPLOAD_GCP_PROJECT_ID=<my project id>`
- `SEARCH_JOBS_UPLOAD_GOOGLE_APPLICATION_CREDENTIALS_FILE=</path/to/file>`
- `SEARCH_JOBS_UPLOAD_GOOGLE_APPLICATION_CREDENTIALS_FILE_CONTENT=<{"my": "content"}>`

### Automatically provision the Search Job Results bucket

If you would like to allow your Sourcegraph instance to control the creation and lifecycle configuration of the target bucket, set the following environment variable:

<Callout type="note">
This requires additional bucket-management permissions from your configured storage vendor (AWS or GCP).
</Callout>

- `SEARCH_JOBS_UPLOAD_MANAGE_BUCKET=true`
