Commit 16801f7

Merge branch 'main' of github.com:Blosc/python-blosc2
2 parents 7c08eff + 7f3da02 commit 16801f7

17 files changed

Lines changed: 735 additions & 100 deletions

CMakeLists.txt

Lines changed: 1 addition & 1 deletion
@@ -63,7 +63,7 @@ endif()
 
 FetchContent_Declare(miniexpr
     GIT_REPOSITORY https://github.com/Blosc/miniexpr.git
-    GIT_TAG 320240a849e185c84114501200052bfeb8d66f2b
+    GIT_TAG fef636f8f2dbefb351b34773a4249472559ca8a5 # v0.2.0
     # SOURCE_DIR ${CMAKE_CURRENT_SOURCE_DIR}/../miniexpr
 )
 FetchContent_MakeAvailable(miniexpr)

bench/ndarray/linear-constructor.py

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@
 
 import blosc2
 
-dtype = np.int64
+dtype = np.float64
 shape = (10_000, 10_000)
 start, stop = 1, 2
 cparams = blosc2.CParams(codec=blosc2.Codec.BLOSCLZ, clevel=1)

bench/ndarray/sum-linear-idx.py

Lines changed: 11 additions & 5 deletions
@@ -13,8 +13,9 @@
 
 import blosc2
 
-dtype = np.float64
+dtype = np.int64
 shape = (10_000, 10_000)
+cparams = blosc2.CParams(codec=blosc2.Codec.BLOSCLZ, clevel=1)
 
 @blosc2.dsl_kernel
 def kernel_ramp():
@@ -23,25 +24,30 @@ def kernel_ramp():
 
 print(kernel_ramp.dsl_source)
 a = blosc2.lazyudf(kernel_ramp, (), dtype=dtype, shape=shape)
-npa = a.compute(cparams=dict(clevel=1, codec=blosc2.Codec.LZ4))
+npa = a.compute(cparams=cparams)
 t0 = time()
 result = npa.sum()
 # print(result)
 print("Blosc2 sum over NDArray:", round(time() - t0, 3), "s")
 
 t0 = time()
 a = blosc2.lazyudf(kernel_ramp, (), dtype=dtype, shape=shape)
-result = a.sum()
+result = a.sum(cparams=cparams)
 # print(result)
 print("Blosc2 sum over LazyArray:", round(time() - t0, 3), "s")
 
 t0 = time()
 a = blosc2.lazyudf(kernel_ramp, (), dtype=dtype, shape=shape)
-# result = a.compute(cparams=dict(clevel=1, codec=blosc2.Codec.LZ4)).sum()
-result = a.compute().sum()
+result = a.compute(cparams=cparams).sum()
 # print(result)
 print("(with a prior .compute):", round(time() - t0, 3), "s")
 
+t0 = time()
+a = blosc2.arange(np.prod(shape), dtype=dtype, shape=shape, cparams=cparams)
+result = a.sum()
+# print(result)
+print("Blosc2 arange + sum:", round(time() - t0, 3), "s")
+
 t0 = time()
 npa = np.arange(np.prod(shape), dtype=dtype).reshape(shape)
 result = npa.sum()

doc/getting_started/dsl_syntax.md

Lines changed: 237 additions & 0 deletions
@@ -0,0 +1,237 @@
# miniexpr DSL Syntax (Canonical Reference)

This is the practical reference for the DSL accepted by `me_compile()`.
It focuses on what works today and the most common gotchas.
For usage walkthroughs and end-to-end examples, see the
[LazyArray UDF DSL kernels tutorial](tutorials/03.lazyarray-udf-kernels.ipynb).

## Quick start

A valid DSL program is one function:

```python
def kernel(x, y):
    temp = sin(x) ** 2
    return temp + cos(y) ** 2
```

Use Python-style indentation and always return a value on the paths you execute.

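Because the Quick start kernel is also valid Python, it can be sanity-checked on plain scalars before it is handed to the DSL compiler. The sketch below assumes `sin` and `cos` come from Python's `math` module, which is only a stand-in for illustration; inside the DSL these names resolve to miniexpr's own functions.

```python
from math import cos, sin


def kernel(x, y):
    temp = sin(x) ** 2
    return temp + cos(y) ** 2


# With x == y the result is sin(x)**2 + cos(x)**2, which is 1 up to rounding.
value = kernel(0.3, 0.3)
print(abs(value - 1.0) < 1e-12)
```

This only checks the arithmetic of the kernel body; it says nothing about how miniexpr evaluates the kernel over array blocks.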
## Program shape

- Exactly one top-level `def ...:` function is expected.
- Leading blank lines and header comments are allowed.
- Any extra trailing content after the function is a parse error.
- Nested `def` inside the function body is not allowed.

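The program-shape rules can be approximated with Python's own `ast` module as a pre-flight check. This is a rough sketch, not miniexpr's parser: the helper name `has_dsl_program_shape` is hypothetical, and Python's grammar accepts constructs the DSL rejects (and the reverse, such as `!` for logical not).

```python
import ast


def has_dsl_program_shape(source: str) -> bool:
    """Approximate the program-shape rules with Python's own parser."""
    try:
        module = ast.parse(source)
    except SyntaxError:
        return False
    # Exactly one top-level statement, and it must be a function definition.
    if len(module.body) != 1 or not isinstance(module.body[0], ast.FunctionDef):
        return False
    func = module.body[0]
    # No nested `def` anywhere inside the function body.
    nested_defs = [
        node
        for stmt in func.body
        for node in ast.walk(stmt)
        if isinstance(node, ast.FunctionDef)
    ]
    return not nested_defs


ok_src = "# header comment\n\ndef kernel(x):\n    return x + 1\n"
trailing_src = ok_src + "y = 2\n"
nested_src = "def kernel(x):\n    def inner(a):\n        return a\n    return x\n"

print(has_dsl_program_shape(ok_src))        # accepted shape
print(has_dsl_program_shape(trailing_src))  # trailing content after the function
print(has_dsl_program_shape(nested_src))    # nested def
```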
## Header pragmas

Supported file-header pragmas:

- `# me:fp=strict|contract|fast`
- `# me:compiler=tcc|cc`

Notes:

- Pragma keys must be unique.
- Unknown `me:*` pragmas are errors.
- Malformed pragma values are errors.

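The three notes above amount to a small validation routine. The following Python sketch mirrors them under stated assumptions: the function `parse_header_pragmas` and the table `ALLOWED_PRAGMAS` are hypothetical names, whitespace around `=` is treated as malformed, and the real miniexpr parser may differ in details such as error types and whitespace handling.

```python
ALLOWED_PRAGMAS = {
    "fp": {"strict", "contract", "fast"},
    "compiler": {"tcc", "cc"},
}


def parse_header_pragmas(source: str) -> dict:
    """Collect `# me:key=value` pragmas from the leading comment lines."""
    pragmas = {}
    for line in source.splitlines():
        text = line.strip()
        if not text:
            continue  # leading blank lines are allowed
        if not text.startswith("#"):
            break  # the header ends at the first non-comment line
        comment = text[1:].strip()
        if not comment.startswith("me:"):
            continue  # ordinary header comment
        key, sep, value = comment[3:].partition("=")
        if key not in ALLOWED_PRAGMAS:
            raise ValueError(f"unknown pragma: me:{key}")
        if not sep or value not in ALLOWED_PRAGMAS[key]:
            raise ValueError(f"malformed value for me:{key}: {value!r}")
        if key in pragmas:
            raise ValueError(f"duplicate pragma: me:{key}")
        pragmas[key] = value
    return pragmas


header = "# me:fp=strict\n# me:compiler=tcc\ndef kernel(x):\n    return x\n"
print(parse_header_pragmas(header))
```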
## Function signature and inputs

- Parameters are positional names: `def kernel(a, b, c): ...`
- Parameter names must be unique.
- At compile time, DSL parameter names must match input variable names by set membership
  (order may differ, count must match).

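The matching rule in the last bullet can be stated as a set comparison. Here is a minimal Python sketch of that check; `params_match_inputs` is a hypothetical helper for illustration, not part of miniexpr or python-blosc2.

```python
def params_match_inputs(params, inputs) -> bool:
    """Return True when parameter names equal input names as sets."""
    if len(set(params)) != len(params):
        return False  # duplicate parameter names
    if len(params) != len(inputs):
        return False  # count must match
    return set(params) == set(inputs)  # order may differ


print(params_match_inputs(["a", "b"], ["b", "a"]))  # same names, different order
print(params_match_inputs(["a"], ["a", "b"]))       # count mismatch
```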
## Statements

Supported statement forms:

- Assignment: `a = expr`
- Compound assignment: `+=`, `-=`, `*=`, `/=`, `//=`
- Expression statement: `expr`
- Return: `return expr`
- Print: `print(...)`
- Conditionals: `if` / `elif` / `else`
- While loop: `while cond:`
- For loop: `for i in range(...):`
- Loop control: `break`, `continue`

General rules:

- Python-style indentation is required.
- Empty blocks are invalid.
- `elif`/`else` must belong to a matching `if`.
- Deprecated forms like `break if cond` / `continue if cond` are not part of DSL syntax.

### `if` / `elif` / `else` example

```python
def kernel(x):
    if x > 0:
        y = x
    elif x == 0:
        y = 1
    else:
        y = -x
    return y
```

### `for` example

```python
def kernel(n):
    acc = 0
    for i in range(0, n, 1):
        acc += i
    return acc
```

### `while` example

```python
def kernel(x):
    i = 0
    y = x
    while i < 3:
        y = y * 2
        i += 1
    return y
```

## Expressions and function calls

Expressions are compiled by miniexpr with DSL checks.

Commonly supported:

- Names and numeric constants
- Unary operators: `+`, `-`, logical not (`not` / `!`)
- Arithmetic and bitwise binary operators
- Comparisons: `==`, `!=`, `<`, `<=`, `>`, `>=`
- Function calls to supported miniexpr functions
- User-registered C functions/closures passed in `me_variable`

Cast intrinsics:

- `int(expr)`
- `float(expr)`
- `bool(expr)`

Cast rules:

- Use function-call form only.
- Exactly one argument.

## Temporary variable type inference

Local temporaries get their dtype from the expression assigned to them.

Example:

```python
def kernel(x):
    temp = sin(x) ** 2
    return temp + cos(x) ** 2
```

In this example, `temp` is inferred from `sin(x) ** 2` (typically a floating type).

Notes:

- You do not need to declare local variable types.
- If you assign a value with an incompatible dtype to the same local later, compilation fails.

## Loops

### `for ... in range(...)`

Supported forms:

```python
for i in range(stop):
    ...
for i in range(start, stop):
    ...
for i in range(start, stop, step):
    ...
```

Rules:

- `range` takes 1, 2, or 3 arguments.
- `step == 0` raises a runtime evaluation error.

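The three forms mirror Python's built-in `range`, so the values a loop would visit can be previewed in plain Python. This sketch assumes the DSL follows Python's half-open convention (`stop` excluded), which the rules above do not state explicitly; `collect` is a hypothetical helper.

```python
def collect(*args):
    """Return the values a `for i in range(*args)` loop would visit."""
    visited = []
    for i in range(*args):
        visited.append(i)
    return visited


print(collect(4))         # one argument: stop
print(collect(1, 4))      # two arguments: start, stop
print(collect(0, 10, 3))  # three arguments: start, stop, step

# Python also rejects a zero step, raising ValueError when range() is called.
try:
    collect(0, 10, 0)
except ValueError as exc:
    print("zero step rejected:", exc)
```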
### `while`

- `while` condition is a regular DSL expression.
- Runtime iteration cap is enforced by `ME_DSL_WHILE_MAX_ITERS`.

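An iteration cap of this kind can be pictured as a counter checked on every pass through the loop. The Python sketch below only illustrates the idea: `run_capped_while` and its `max_iters` parameter are hypothetical stand-ins, and the actual value of `ME_DSL_WHILE_MAX_ITERS` and the error miniexpr reports are not given here.

```python
def run_capped_while(cond, body, state, max_iters):
    """Run `while cond(state): state = body(state)` with an iteration cap."""
    iters = 0
    while cond(state):
        if iters >= max_iters:
            raise RuntimeError(f"while-loop exceeded {max_iters} iterations")
        state = body(state)
        iters += 1
    return state


# Terminates after 3 iterations, well under the cap.
print(run_capped_while(lambda s: s < 3, lambda s: s + 1, 0, max_iters=1000))

# A condition that never becomes false hits the cap instead of hanging.
try:
    run_capped_while(lambda s: True, lambda s: s, 0, max_iters=1000)
except RuntimeError as exc:
    print(exc)
```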
## `print(...)`

`print` is supported as a DSL statement.

Rules:

- At least one argument is required.
- First argument may be a format string.
- Placeholder count must match provided values.
- Printed expressions must be uniform/scalar for the block.

## Reserved names

Do not use these as user variable/function names in DSL:

- `print`, `int`, `float`, `bool`, `def`, `return`
- `_ndim`
- `_i<d>` and `_n<d>` (reserved ND symbols)
- `_flat_idx`

## ND reserved symbols

When referenced, these are synthesized by DSL compiler/runtime:

- `_i0`, `_i1`, ... (index per dimension)
- `_n0`, `_n1`, ... (shape per dimension)
- `_ndim`
- `_flat_idx` (global C-order linear index)

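The relationship between the per-dimension symbols and a C-order linear index can be written out in a few lines of Python. This is the standard row-major formula, shown as a sketch; `flat_index` is a hypothetical helper, and it is an assumption that `_flat_idx` is computed from the global `_i<d>` and `_n<d>` values in exactly this way.

```python
def flat_index(indices, shape):
    """C-order (row-major) linear index for one element of an ND array."""
    if len(indices) != len(shape):
        raise ValueError("indices and shape must have the same length")
    flat = 0
    for i, n in zip(indices, shape):
        if not 0 <= i < n:
            raise IndexError("index out of bounds")
        # Horner-style accumulation: the last dimension varies fastest.
        flat = flat * n + i
    return flat


# For shape (_n0, _n1) = (3, 4), element (_i0, _i1) = (2, 1) has
# linear index 2 * 4 + 1 = 9.
print(flat_index((2, 1), (3, 4)))
```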
## Typing and return behavior

- Reassigning incompatible dtypes to the same local is a compile-time error.
- Return dtype must be consistent across all `return` statements.
- Non-guaranteed return paths may compile; if execution reaches a missing return path, evaluation fails at runtime.

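The last bullet describes a kernel that compiles but can still fail depending on its inputs. The Python sketch below emulates that behavior: the `evaluate` wrapper is hypothetical and simply treats Python's implicit `None` as the "missing return" condition; the error miniexpr actually raises is not specified here.

```python
def kernel(x):
    if x > 0:
        return x
    # No return on this path: the return is "non-guaranteed".


def evaluate(func, *args):
    """Emulate the runtime check: falling off the end is an error."""
    result = func(*args)
    if result is None:
        raise RuntimeError("execution reached a path with no return")
    return result


print(evaluate(kernel, 2))  # takes the path that returns

try:
    evaluate(kernel, -1)    # takes the path without a return
except RuntimeError as exc:
    print(exc)
```

Adding an `else` branch (or a final `return`) so that every path returns a value avoids the input-dependent failure.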
## Compound assignment desugaring

- `a += b` -> `a = a + b`
- `a -= b` -> `a = a - b`
- `a *= b` -> `a = a * b`
- `a /= b` -> `a = a / b`
- `a //= b` -> `a = floor(a / b)`

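The `//=` rule is the only one that introduces a function call, so it is worth seeing what `floor(a / b)` does with negative operands. The sketch below reads the desugared form in plain Python using `math.floor`; treating the DSL's `floor` and `/` as equivalent to Python's is an assumption, and `floordiv_desugared` is a hypothetical helper.

```python
import math


def floordiv_desugared(a, b):
    """Plain-Python reading of `a //= b` as `a = floor(a / b)`."""
    return math.floor(a / b)


# floor rounds toward negative infinity, unlike truncation toward zero.
print(floordiv_desugared(7, 2))   # 3
print(floordiv_desugared(-7, 2))  # -4
print(int(-7 / 2))                # -3 (truncation, shown for contrast)
```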
## Compile-time vs runtime errors

Compile-time error examples:

- Invalid program shape or signature
- Unsupported statement forms
- Invalid `range(...)` arity
- Invalid cast intrinsic arity
- Reserved-name misuse
- Return dtype mismatch

Runtime error examples:

- `range(..., step=0)`
- Missing return on executed control path
- While-loop iteration cap exceeded

## Python syntax that is out of DSL scope

These Python features are not part of this DSL:

- Ternary expression: `a if cond else b`
- `for ... else` and `while ... else`
- Keyword-argument calls and other call forms outside the supported subset
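
A ternary expression can usually be rewritten with the `if` / `else` statement form listed under Statements. The sketch below shows one such rewrite; it runs as ordinary Python here, and it is an assumption, based only on the rules in this reference, that miniexpr accepts it unchanged.

```python
def kernel(x):
    # Instead of the unsupported ternary: y = x if x > 0 else -x
    if x > 0:
        y = x
    else:
        y = -x
    return y


print(kernel(3), kernel(-3))
```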

doc/getting_started/index.rst

Lines changed: 1 addition & 0 deletions
@@ -7,3 +7,4 @@ Getting Started
     overview
     installation
     tutorials
+    dsl_syntax
