Commit 672076a: Update the README

1 parent 6b2b549 commit 672076a

1 file changed

Lines changed: 17 additions & 18 deletions

README.md
@@ -6,7 +6,7 @@
 
 A Ruby toolkit for managing geospatial metadata, including:
 
-- tasks for cloning, updating, and indexing OpenGeoMetdata metadata
+- tasks for cloning, updating, and indexing OpenGeoMetadata metadata
 - library for converting metadata between standards
 
 ## Installation
@@ -19,11 +19,15 @@ gem 'geo_combine'
 
 And then execute:
 
-$ bundle install
+```sh
+$ bundle install
+```
 
 Or install it yourself as:
 
-$ gem install geo_combine
+```sh
+$ gem install geo_combine
+```
 
 ## Usage
 
@@ -71,6 +75,14 @@ GeoCombine::Migrators::V1AardvarkMigrator.new(v1_hash: record, collection_id_map
 
 ### OpenGeoMetadata
 
+#### Logging
+
+Some of the tools and scripts in this gem use Ruby's `Logger` class to print information to `$stderr`. By default, the log level is set to `Logger::INFO`. For more verbose information, you can set the `LOG_LEVEL` environment variable to `DEBUG`:
+
+```sh
+$ LOG_LEVEL=DEBUG bundle exec rake geocombine:clone
+```
+
 #### Clone OpenGeoMetadata repositories locally
 
 ```sh
@@ -124,23 +136,14 @@ To index into Solr, GeoCombine requires a Solr instance that is running the
 $ bundle exec rake geocombine:index
 ```
 
-Indexes the `geoblacklight.json` files in cloned repositories to a Solr index running at http://127.0.0.1:8983/solr
-
-##### Custom Solr location
+If Blacklight is installed in the ruby environment and a solr index is configured, the rake task will use the solr index configured in the Blacklight application (this is the case when invoking GeoCombine from your GeoBlacklight installation). If Blacklight is unavailable, the rake task will try to find a Solr instance running at `http://localhost:8983/solr/blacklight-core`.
 
-Solr location can also be specified by an environment variable `SOLR_URL`.
+You can also set the Solr instance URL using `SOLR_URL`:
 
 ```sh
 $ SOLR_URL=http://www.example.com:1234/solr/collection bundle exec rake geocombine:index
 ```
 
-Depending on your Solr instance's performance characteristics, you may want to
-change the [`commitWithin` parameter](https://lucene.apache.org/solr/guide/6_6/updatehandlers-in-solrconfig.html) (in milliseconds):
-
-```sh
-$ SOLR_COMMIT_WITHIN=100 bundle exec rake geocombine:index
-```
-
 ### Harvesting and indexing documents from GeoBlacklight sites
 
 GeoCombine provides a Harvester class and rake task to harvest and index content from GeoBlacklight sites (or any site that follows the Blacklight API format). Given that the configurations can change from consumer to consumer and site to site, the class provides a relatively simple configuration API. This can be configured in an initializer, a wrapping rake task, or any other ruby context where the rake task or the class would be invoked.
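As a sketch of what such an initializer might contain: the diff itself does not show the configuration API, so the `configure` method, the per-site keys, and all values below are assumptions for illustration, not the gem's confirmed interface.

```ruby
# Hypothetical configuration sketch (names and structure are assumed,
# not taken from this diff). Global options apply to all sites; each
# remaining key describes one GeoBlacklight site to harvest.
GeoCombine::GeoBlacklightHarvester.configure do
  {
    crawl_delay: 1,       # assumed key: seconds to wait between requests
    commit_within: 5000,  # assumed key: milliseconds, passed to Solr
    SITE_A: {
      host: 'https://geoblacklight.example.edu',           # hypothetical site
      params: { f: { dct_provenance_s: ['SITE_A'] } }      # hypothetical filter
    }
  }
end
```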
@@ -186,10 +189,6 @@ Crawl delays can be configured (in seconds) either globally for all sites or on
 
 Solr's commitWithin option can be configured (in milliseconds) by passing a value under the commit_within key.
 
-##### Debugging (default: false)
-
-The harvester and indexer will only `puts` content when errors happen. It is possible to see some progress information by setting the debug configuration option.
-
 #### Transforming Documents
 
 You may need to transform documents that are harvested for various purposes (removing fields, adding fields, omitting a document altogether, etc). You can configure some ruby code (a proc) that will take the document in, transform it, and return the transformed document. By default the indexer will remove the `score`, `timestamp`, and `_version_` fields from the documents harvested. If you provide your own transformer, you'll likely want to remove these fields in addition to the other transformations you provide.
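As an illustration, a transformer proc that reproduces the default field removal and adds one field of its own might look like the sketch below. The variable names and the added field are invented, and how the proc is registered with the indexer is not shown in this diff.

```ruby
# Fields the indexer drops by default, per the README text above.
DROPPED_FIELDS = %w[score timestamp _version_].freeze

# A transformer receives a harvested document hash and returns the
# transformed hash.
custom_transformer = lambda do |document|
  cleaned = document.reject { |field, _value| DROPPED_FIELDS.include?(field) }
  # Example extra transformation: tag the record with a local boolean field.
  cleaned.merge('harvested_b' => true)
end
```

Hash#reject returns a new hash, so the original document is left untouched; any additional transformations can be chained before the final hash is returned.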
