Commit b72b0d0

abetomo and otegami authored
Add a reference about pgroonga_language_model_vectorize() (#432)
Co-authored-by: takuya kodama <a.s.takuya1026@gmail.com>
1 parent a9cab2f commit b72b0d0

5 files changed

Lines changed: 295 additions & 0 deletions

Lines changed: 154 additions & 0 deletions
@@ -0,0 +1,154 @@
+# Japanese translations for PACKAGE package.
+# Copyright (C) 2025 THE PACKAGE'S COPYRIGHT HOLDER
+# This file is distributed under the same license as the PACKAGE package.
+# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
+#
+msgid ""
+msgstr ""
+"Project-Id-Version: PACKAGE VERSION\n"
+"Report-Msgid-Bugs-To: \n"
+"PO-Revision-Date: 2025-10-16 04:39+0000\n"
+"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
+"Language-Team: Japanese\n"
+"Language: ja\n"
+"MIME-Version: 1.0\n"
+"Content-Type: text/plain; charset=UTF-8\n"
+"Content-Transfer-Encoding: 8bit\n"
+"Plural-Forms: nplurals=; plural=;\n"
+
+msgid ""
+"---\n"
+"title: pgroonga_language_model_vectorize function\n"
+"upper_level: ../\n"
+"---"
+msgstr ""
+"---\n"
+"title: pgroonga_language_model_vectorize 関数\n"
+"upper_level: ../\n"
+"---"
+
+msgid "# `pgroonga_language_model_vectorize` function"
+msgstr "# `pgroonga_language_model_vectorize` 関数"
+
+msgid "Since 4.0.5."
+msgstr "4.0.5で追加"
+
+msgid "This is still an experimental feature."
+msgstr "まだ実験的な機能です。"
+
+msgid "## Summary"
+msgstr "## 概要"
+
+msgid ""
+"`pgroonga_language_model_vectorize` function returns a normalized embedding "
+"from the given text."
+msgstr ""
+"`pgroonga_language_model_vectorize`関数は指定されたテキストの正規化されたエン"
+"ベディングを返します。"
+
+msgid "## Syntax"
+msgstr "## 構文"
+
+msgid "Here is the syntax of this function:"
+msgstr "この関数の構文は次の通りです。"
+
+msgid ""
+"```text\n"
+"float4[] pgroonga_language_model_vectorize(model_name, target)\n"
+"```"
+msgstr ""
+
+msgid ""
+"`model_name` is the name of language model to be used. It's `text` type."
+msgstr "`model_name`は使用する言語モデルの名前です。型は`text`です。"
+
+msgid ""
+"You can specify a Hugging Face URI for `model_name`.\n"
+"We recommend using a Hugging Face URI, as it automatically downloads and "
+"sets up the model.\n"
+"The model is saved in the PostgreSQL database's data directory."
+msgstr ""
+"`model_name`にはHugging FaceのURIが指定できます。Hugging FaceのURIを指定する"
+"とモデルのダウンロードと配置を自動で行うのでおすすめです。モデルはPostgreSQL"
+"のデータベースのデータディレクトリに保存されます。"
+
+msgid ""
+"The first time you run it, it will also download the model. Therefore, it "
+"will take some time.\n"
+"For subsequent runs, the local model file that has already been downloaded "
+"will be used."
+msgstr ""
+"初回実行時にはモデルをダウンロードします。そのため時間がかかります。2回目以降"
+"は、ダウンロード済のローカルモデルファイルを使うため、ダウンロードの時間はか"
+"かりません。"
+
+msgid "`target` is the input text. It's `text` type."
+msgstr "`target`は入力テキストです。型は`text`です。"
+
+msgid ""
+"`pgroonga_language_model_vectorize` returns a normalized embedding (an array "
+"of `float4`)."
+msgstr ""
+"`pgroonga_language_model_vectorize`は正規化されたエンベディング(`float4`の配"
+"列)を返します。"
+
+msgid "## Usage"
+msgstr "## 使い方"
+
+msgid ""
+"Here is an example of generating embeddings by specifying the Hugging Face "
+"URI (`hf:///groonga/all-MiniLM-L6-v2-Q4_K_M-GGUF`)."
+msgstr ""
+"Hugging FaceのURI(`hf:///groonga/all-MiniLM-L6-v2-Q4_K_M-GGUF`)を指定して、"
+"エンベディングを生成する例です。"
+
+msgid ""
+"```sql\n"
+"CREATE TABLE memos (\n"
+"  content text\n"
+");"
+msgstr ""
+
+msgid ""
+"INSERT INTO memos VALUES ('I am a king.');\n"
+"INSERT INTO memos VALUES ('I am a queen.');"
+msgstr ""
+
+msgid ""
+"SELECT (pgroonga_language_model_vectorize(\n"
+"  'hf:///groonga/all-MiniLM-L6-v2-Q4_K_M-GGUF',\n"
+"  content))[1:3]\n"
+"FROM memos;"
+msgstr ""
+
+msgid ""
+"-- pgroonga_language_model_vectorize \n"
+"-- ----------------------------------------\n"
+"-- {-0.027845971,0.04939433,-0.006889274}\n"
+"-- {0.088152654,-0.027521685,0.051739622}\n"
+"-- (2 rows)\n"
+"```"
+msgstr ""
+
+msgid ""
+"Showing all embeddings would be too long, so only the first three are shown "
+"using `[1:3]`."
+msgstr ""
+"エンベディングのすべてを表示するとかなり長いので `[1:3]` で先頭の3つのみ表示"
+"しています。"
+
+msgid "## See also"
+msgstr "## 参考"
+
+msgid ""
+"* [Groonga's `language_model_vectorize` function][groonga-language-model-"
+"vectorize]"
+msgstr ""
+"* [Groongaの`language_model_vectorize`関数][groonga-language-model-vectorize]"
+
+msgid ""
+"[groonga-language-model-vectorize]:https://groonga.org/docs/reference/"
+"functions/language_model_vectorize.html"
+msgstr ""
+"[groonga-language-model-vectorize]:https://groonga.org/ja/docs/reference/"
+"functions/language_model_vectorize.html"
Lines changed: 64 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,64 @@
1+
---
2+
title: pgroonga_language_model_vectorize 関数
3+
upper_level: ../
4+
---
5+
6+
# `pgroonga_language_model_vectorize` 関数
7+
8+
4.0.5で追加
9+
10+
まだ実験的な機能です。
11+
12+
## 概要
13+
+`pgroonga_language_model_vectorize`関数は指定されたテキストの正規化されたエンベディングを返します。
+
+## 構文
+
+この関数の構文は次の通りです。
+
+```text
+float4[] pgroonga_language_model_vectorize(model_name, target)
+```
+
+`model_name`は使用する言語モデルの名前です。型は`text`です。
+
+`model_name`にはHugging FaceのURIが指定できます。Hugging FaceのURIを指定するとモデルのダウンロードと配置を自動で行うのでおすすめです。モデルはPostgreSQLのデータベースのデータディレクトリに保存されます。
+
+初回実行時にはモデルをダウンロードします。そのため時間がかかります。2回目以降は、ダウンロード済のローカルモデルファイルを使うため、ダウンロードの時間はかかりません。
+
+`target`は入力テキストです。型は`text`です。
+
+`pgroonga_language_model_vectorize`は正規化されたエンベディング(`float4`の配列)を返します。
+
+## 使い方
+
+Hugging FaceのURI(`hf:///groonga/all-MiniLM-L6-v2-Q4_K_M-GGUF`)を指定して、エンベディングを生成する例です。
+
+```sql
+CREATE TABLE memos (
+  content text
+);
+
+INSERT INTO memos VALUES ('I am a king.');
+INSERT INTO memos VALUES ('I am a queen.');
+
+SELECT (pgroonga_language_model_vectorize(
+  'hf:///groonga/all-MiniLM-L6-v2-Q4_K_M-GGUF',
+  content))[1:3]
+FROM memos;
+
+-- pgroonga_language_model_vectorize
+-- ----------------------------------------
+-- {-0.027845971,0.04939433,-0.006889274}
+-- {0.088152654,-0.027521685,0.051739622}
+-- (2 rows)
+```
+
+エンベディングのすべてを表示するとかなり長いので `[1:3]` で先頭の3つのみ表示しています。
+
+## 参考
+
+* [Groongaの`language_model_vectorize`関数][groonga-language-model-vectorize]
+
+[groonga-language-model-vectorize]:https://groonga.org/ja/docs/reference/functions/language_model_vectorize.html

ja/reference/index.md

Lines changed: 5 additions & 0 deletions
@@ -545,6 +545,10 @@ PGroongaは`pgroonga`スキーマに関数・演算子・演算子クラスな

 * [`pgroonga_is_writable`関数][is-writable]

+* [`pgroonga_language_model_vectorize` 関数][language-model-vectorize]
+
+  * 4.0.5で追加
+
 * [`pgroonga_list_broken_indexes` 関数][list-broken-indexes]

 * [`pgroonga_list_lagged_indexes` 関数][list-lagged-indexes]
@@ -768,6 +772,7 @@ PGroongaは`pgroonga`スキーマに関数・演算子・演算子クラスな
 [highlight-html]:functions/pgroonga-highlight-html.html
 [index-column-name]:functions/pgroonga-index-column-name.html
 [is-writable]:functions/pgroonga-is-writable.html
+[language-model-vectorize]:functions/pgroonga-language-model-vectorize.html
 [list-broken-indexes]:functions/pgroonga-list-broken-indexes.html
 [list-lagged-indexes]:functions/pgroonga-list-lagged-indexes.html
 [match-positions-byte]:functions/pgroonga-match-positions-byte.html
Lines changed: 67 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,67 @@
1+
---
2+
title: pgroonga_language_model_vectorize function
3+
upper_level: ../
4+
---
5+
6+
# `pgroonga_language_model_vectorize` function
7+
8+
Since 4.0.5.
9+
10+
This is still an experimental feature.
11+
12+
## Summary
13+
14+
`pgroonga_language_model_vectorize` function returns a normalized embedding from the given text.
15+
16+
## Syntax
17+
18+
Here is the syntax of this function:
19+
20+
```text
21+
float4[] pgroonga_language_model_vectorize(model_name, target)
22+
```
23+
24+
`model_name` is the name of language model to be used. It's `text` type.
25+
+You can specify a Hugging Face URI for `model_name`.
+We recommend using a Hugging Face URI, as it automatically downloads and sets up the model.
+The model is saved in the PostgreSQL database's data directory.
+
+The first time you run it, it will also download the model. Therefore, it will take some time.
+For subsequent runs, the local model file that has already been downloaded will be used.
+
+`target` is the input text. It's `text` type.
+
+`pgroonga_language_model_vectorize` returns a normalized embedding (an array of `float4`).
+
+## Usage
+
+Here is an example of generating embeddings by specifying the Hugging Face URI (`hf:///groonga/all-MiniLM-L6-v2-Q4_K_M-GGUF`).
+
+```sql
+CREATE TABLE memos (
+  content text
+);
+
+INSERT INTO memos VALUES ('I am a king.');
+INSERT INTO memos VALUES ('I am a queen.');
+
+SELECT (pgroonga_language_model_vectorize(
+  'hf:///groonga/all-MiniLM-L6-v2-Q4_K_M-GGUF',
+  content))[1:3]
+FROM memos;
+
+-- pgroonga_language_model_vectorize
+-- ----------------------------------------
+-- {-0.027845971,0.04939433,-0.006889274}
+-- {0.088152654,-0.027521685,0.051739622}
+-- (2 rows)
+```
+
+Showing all embeddings would be too long, so only the first three are shown using `[1:3]`.
+
+## See also
+
+* [Groonga's `language_model_vectorize` function][groonga-language-model-vectorize]
+
+[groonga-language-model-vectorize]:https://groonga.org/docs/reference/functions/language_model_vectorize.html
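
The reference page added here says the function returns a normalized embedding. The following is a minimal Python sketch of why that matters, and it is not part of this commit. It assumes "normalized" means unit Euclidean (L2) length, which the page does not state explicitly. Under that assumption the dot product of two embeddings equals their cosine similarity, so embeddings can be compared with a plain dot product. The vectors and the helper names (`l2_normalize`, `dot`, `cosine_similarity`) are hypothetical stand-ins, not PGroonga output or API.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit Euclidean (L2) length."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine similarity computed the general way, without assuming unit length."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Hypothetical 4-dimensional vectors standing in for real embeddings.
king = l2_normalize([1.0, 2.0, 0.5, -1.0])
queen = l2_normalize([0.8, 2.1, 0.4, -0.7])

# For unit-length vectors the two values agree up to rounding error.
print(dot(king, queen))
print(cosine_similarity(king, queen))
```

Note that a slice such as `[1:3]` from the usage example is only a prefix of the embedding, so it is generally not unit length and should not be used for similarity.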

reference/index.md

Lines changed: 5 additions & 0 deletions
@@ -543,6 +543,10 @@ Use [`pgroonga_jsonb_ops_v2` operator class](#text-jsonb-ops-v2) instead.

 * [`pgroonga_is_writable` function][is-writable]

+* [`pgroonga_language_model_vectorize` function][language-model-vectorize]
+
+  * Since 4.0.5.
+
 * [`pgroonga_list_broken_indexes` function][list-broken-indexes]

 * [`pgroonga_list_lagged_indexes` function][list-lagged-indexes]
@@ -766,6 +770,7 @@ But you need to tune PGroonga in some cases such as a case that you need to hand
 [highlight-html]:functions/pgroonga-highlight-html.html
 [index-column-name]:functions/pgroonga-index-column-name.html
 [is-writable]:functions/pgroonga-is-writable.html
+[language-model-vectorize]:functions/pgroonga-language-model-vectorize.html
 [list-broken-indexes]:functions/pgroonga-list-broken-indexes.html
 [list-lagged-indexes]:functions/pgroonga-list-lagged-indexes.html
 [match-positions-byte]:functions/pgroonga-match-positions-byte.html
