Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -59,6 +59,8 @@ tracebloc dataset push ./my-data \
--label-column label
```

> **`tracebloc: command not found` after installing?** The binary installs to `~/.local/bin` when `/usr/local/bin` isn't writable, and an already-running shell won't see the new PATH entry until you open a new terminal (or `. ~/.bashrc`). See **[Troubleshooting installation](docs/troubleshooting.md)**.

What that runs under the curtain:

1. Reads kubeconfig, discovers the parent `tracebloc/client` release in the cluster
Expand Down
110 changes: 110 additions & 0 deletions docs/troubleshooting.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,110 @@
# Troubleshooting

## `tracebloc: command not found` after install

The installer downloads a single binary and drops it in a `bin` directory. If
that directory isn't on your shell's `PATH`, the `tracebloc` command won't be
found even though the binary is installed.

### Why it happens

`install.sh` installs to `/usr/local/bin` when that's writable. The usual
unprivileged one-liner (`curl … | sh`) **can't** write there, so it falls back
to **`$HOME/.local/bin`** — which isn't on `PATH` in many setups.

The installer adds `$HOME/.local/bin` to the rc file your shell actually reads
(`~/.bashrc`, `~/.zshrc`, `~/.bash_profile` on macOS, `~/.config/fish/config.fish`,
or `~/.profile`), but a shell that's **already running** won't see the change —
you need a new shell, or to re-load the rc file.

On Linux desktops there's an extra trap: the stock `~/.profile` only adds
`~/.local/bin` to `PATH` **at login**, and only if the directory already existed
at login time. The installer creates it mid-session, and a new terminal window
is an interactive *non-login* shell that reads `~/.bashrc` (not `~/.profile`) —
so "open a new terminal" alone never triggers the `~/.profile` logic. That's why
the installer writes to `~/.bashrc` directly.

### Fix it

**1. Confirm it's only a PATH problem** — run the binary by its full path:

```sh
~/.local/bin/tracebloc version # or: /usr/local/bin/tracebloc version
```

If that prints a version, the install is fine and this is purely `PATH`.

**2. Pick up the PATH entry the installer added** — open a **new** terminal, or
re-load your rc file in the current one:

```sh
. ~/.bashrc # bash on Linux
. ~/.zshrc # zsh (macOS default)
. ~/.bash_profile # bash on macOS
```

Then:

```sh
tracebloc version
```

**3. If it's still not found**, check which shell you're in and that the entry
landed in the matching rc file:

```sh
echo "$SHELL"
grep -n '.local/bin' ~/.bashrc ~/.zshrc ~/.bash_profile ~/.profile 2>/dev/null
```

If the line is in the wrong file (e.g. `~/.bashrc` while you actually run zsh, or
`~/.bashrc` on macOS where Terminal opens a login shell that reads
`~/.bash_profile`), add it to the right one:

```sh
echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.zshrc # adjust file to your shell
. ~/.zshrc
```

### Cleanest alternative: install where PATH already points

`/usr/local/bin` is on `PATH` for both login and non-login shells out of the box
on Linux and macOS. Installing there sidesteps the whole rc/login-shell question:

```sh
# Move the binary you already have:
sudo mv ~/.local/bin/tracebloc /usr/local/bin/

# …or re-run the installer with write access to /usr/local/bin:
curl -fsSL https://github.com/tracebloc/cli/releases/latest/download/install.sh | sudo sh
```

### Windows (PowerShell)

`install.ps1` installs to `%LOCALAPPDATA%\Programs\tracebloc` and adds it to your
**user** `PATH` automatically. The current PowerShell session won't refresh —
**open a new PowerShell window**, then:

```powershell
tracebloc version
```

If it's still missing, confirm the entry and check nothing at Process/Machine
scope overrides it:

```powershell
[Environment]::GetEnvironmentVariable('Path','User') -split ';' | Select-String tracebloc
```

## Server / SSH sessions

Each SSH login is a *login* shell, so it reads `~/.profile` (or `~/.bash_profile`
/ `~/.bash_login` if one exists — bash reads only the **first** that's present).
If you have a `~/.bash_profile` that doesn't source `~/.profile` or `~/.bashrc`,
the PATH entry may be skipped. Either add the `export PATH=…` line to the file
your login shell actually reads, or use the `/usr/local/bin` approach above.

## Still stuck?

Open an issue at [github.com/tracebloc/cli/issues](https://github.com/tracebloc/cli/issues)
with your OS, shell (`echo "$SHELL"`), and the output of `ls -l ~/.local/bin/tracebloc`.
7 changes: 6 additions & 1 deletion internal/cli/dataset.go
Original file line number Diff line number Diff line change
Expand Up @@ -483,7 +483,7 @@ contributors train against it without ever seeing the raw files.`))
}
a.Spec.Schema = sch
} else {
sch, skipped, ierr := push.InferSchema(layout.LabelsCSV)
sch, skipped, empty, ierr := push.InferSchema(layout.LabelsCSV)
if ierr != nil {
return &exitError{code: 3, err: fmt.Errorf("inferring schema from CSV: %w", ierr)}
}
Expand All @@ -495,6 +495,11 @@ contributors train against it without ever seeing the raw files.`))
_, _ = fmt.Fprintf(out,
" (skipped framework-managed column(s): %s)\n", strings.Join(skipped, ", "))
}
if len(empty) > 0 {
_, _ = fmt.Fprintf(out,
" (warning: %d column(s) had no values in the sample and were typed FLOAT (nullable): %s)\n",
len(empty), strings.Join(empty, ", "))
}
}
case push.IsImage(a.Spec.Category):
// keypoint_detection needs --number-of-keypoints (dataset-
Expand Down
28 changes: 20 additions & 8 deletions internal/push/tabular.go
Original file line number Diff line number Diff line change
Expand Up @@ -145,8 +145,10 @@ func ParseSchema(s string) (map[string]string, error) {
// InferSchema reads the CSV header and a sample of rows and infers a
// column→SQL-type map: all-integer columns → INT, otherwise
// all-numeric → FLOAT, otherwise VARCHAR(255). Empty cells are
// ignored when judging a column; a column with no non-empty sampled
// value falls back to VARCHAR(255).
// ignored when judging a column; a column with NO non-empty sampled
// value is typed as a nullable FLOAT (not VARCHAR — an all-NULL VARCHAR
// is exactly what the ingestor's string validator rejects) and returned
// in `empty` so the caller can warn.
//
// It's a convenience so customers don't hand-write a --schema for the
// common case. Non-numeric specials (timestamps, dates, booleans)
Expand All @@ -155,20 +157,20 @@ func ParseSchema(s string) (map[string]string, error) {
// Framework-managed columns (see reservedColumns — id, data_id, …)
// are skipped and returned as the second value so the caller can tell
// the customer they weren't included.
func InferSchema(csvPath string) (schema map[string]string, skipped []string, err error) {
func InferSchema(csvPath string) (schema map[string]string, skipped, empty []string, err error) {
f, err := os.Open(csvPath)
if err != nil {
return nil, nil, err
return nil, nil, nil, err
}
defer func() { _ = f.Close() }()

r := csv.NewReader(f)
header, err := r.Read()
if err != nil {
return nil, nil, fmt.Errorf("reading CSV header from %s: %w", csvPath, err)
return nil, nil, nil, fmt.Errorf("reading CSV header from %s: %w", csvPath, err)
}
if len(header) == 0 {
return nil, nil, fmt.Errorf("CSV %s has no columns", csvPath)
return nil, nil, nil, fmt.Errorf("CSV %s has no columns", csvPath)
}

// Per-column running judgement.
Expand All @@ -186,7 +188,7 @@ func InferSchema(csvPath string) (schema map[string]string, skipped []string, er
break
}
if err != nil {
return nil, nil, fmt.Errorf("reading CSV row from %s: %w", csvPath, err)
return nil, nil, nil, fmt.Errorf("reading CSV row from %s: %w", csvPath, err)
}
for i := 0; i < len(header) && i < len(row); i++ {
v := strings.TrimSpace(row[i])
Expand Down Expand Up @@ -221,9 +223,19 @@ func InferSchema(csvPath string) (schema map[string]string, skipped []string, er
schema[col] = "INT"
case sawValue[i] && couldBeFloat[i]:
schema[col] = "FLOAT"
case !sawValue[i]:
// Entirely empty in the sample (e.g. an unmeasured analyte in a
// sparse panel). It can't be typed from data; default to a
// nullable FLOAT rather than VARCHAR — a tabular feature column
// is numeric far more often than text, and an all-NULL VARCHAR
// is exactly the shape the ingestor's string validator rejects.
// Reported in `empty` so the caller can warn / the user can
// --schema-override.
schema[col] = "FLOAT"
empty = append(empty, col)
default:
schema[col] = "VARCHAR(255)"
}
}
return schema, skipped, nil
return schema, skipped, empty, nil
}
22 changes: 13 additions & 9 deletions internal/push/tabular_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -66,7 +66,7 @@ func TestInferSchema(t *testing.T) {
csv := writeFile(t, dir, "data.csv",
"count,age,price,name\n1,30,9.99,alice\n2,40,19.5,bob\n")

schema, _, err := InferSchema(csv)
schema, _, _, err := InferSchema(csv)
if err != nil {
t.Fatalf("InferSchema: %v", err)
}
Expand All @@ -83,22 +83,26 @@ func TestInferSchema(t *testing.T) {
}
}

// TestInferSchema_EmptyColumnIsVarchar: a column with no non-empty
// sampled value can't be typed, so it falls back to VARCHAR(255)
// rather than being mislabeled INT/FLOAT.
func TestInferSchema_EmptyColumnIsVarchar(t *testing.T) {
// TestInferSchema_EmptyColumnIsFloat: a column with no non-empty sampled
// value can't be typed from data; it's returned as a nullable FLOAT (not
// VARCHAR — an all-NULL VARCHAR is what the ingestor's string validator
// rejects) and reported in the `empty` return so the caller can warn.
func TestInferSchema_EmptyColumnIsFloat(t *testing.T) {
dir := t.TempDir()
csv := writeFile(t, dir, "data.csv", "filled,empty\n1,\n2,\n")
schema, _, err := InferSchema(csv)
schema, _, empty, err := InferSchema(csv)
if err != nil {
t.Fatalf("InferSchema: %v", err)
}
if schema["empty"] != "VARCHAR(255)" {
t.Errorf("schema[empty] = %q, want VARCHAR(255)", schema["empty"])
if schema["empty"] != "FLOAT" {
t.Errorf("schema[empty] = %q, want FLOAT", schema["empty"])
}
if schema["filled"] != "INT" {
t.Errorf("schema[filled] = %q, want INT", schema["filled"])
}
if len(empty) != 1 || empty[0] != "empty" {
t.Errorf("empty = %v, want [empty]", empty)
}
}

// TestInferSchema_SkipsReservedColumns: a CSV with an `id` (or other
Expand All @@ -109,7 +113,7 @@ func TestInferSchema_SkipsReservedColumns(t *testing.T) {
dir := t.TempDir()
csv := writeFile(t, dir, "data.csv", "id,feature_00,label\n1,1.5,0\n2,2.5,1\n")

schema, skipped, err := InferSchema(csv)
schema, skipped, _, err := InferSchema(csv)
if err != nil {
t.Fatalf("InferSchema: %v", err)
}
Expand Down
91 changes: 83 additions & 8 deletions scripts/install.sh
Original file line number Diff line number Diff line change
Expand Up @@ -259,16 +259,91 @@ echo "Verify with:"
echo " $PREFIX/$BINARY_NAME version"
echo ""

# PATH guidance for the fallback case.
case ":$PATH:" in
*":$PREFIX:"*) ;; # already on PATH
*)
echo "Note: $PREFIX is not on \$PATH. Add this to your shell rc file:"
echo " export PATH=\"\$PATH:$PREFIX\""
echo ""
;;
# PATH handling. install.ps1 persists the PATH entry on Windows
# (SetEnvironmentVariable, User scope) — do the same on Unix by writing to
# the rc file the user's shell actually reads. The old print-only advice
# silently failed on Ubuntu: ~/.profile adds ~/.local/bin only at *login*
# and only if it already existed, but the installer creates it mid-session,
# so a new (non-login) terminal reading ~/.bashrc never picks it up.
#
# Decide whether to persist a PATH entry to the user's shell rc:
# - a user-local prefix (under $HOME — the unprivileged `curl | sh` fallback)
# ALWAYS needs one, even when a one-off `export` already put it on $PATH for
# this shell (that won't survive into a new terminal);
# - a non-$HOME prefix that ISN'T on $PATH (e.g. `--prefix /opt/tracebloc`)
# also needs one;
# - a non-$HOME prefix already on $PATH (e.g. the default /usr/local/bin) is
# on PATH for every shell and needs nothing.
# ($HOME is stripped of a trailing slash first so a HOME like "/home/u/" can't
# misclassify "/home/u/.local/bin" via a "/home/u//*" pattern it won't match.)
home_dir="${HOME%/}"
persist=no
case "$PREFIX" in
"$home_dir"/*) persist=yes ;;
*) case ":$PATH:" in *":$PREFIX:"*) ;; *) persist=yes ;; esac ;;
esac

if [ "$persist" = "yes" ]; then
shell_name="$(basename "${SHELL:-sh}")"
case "$shell_name" in
zsh) rc="$HOME/.zshrc" ;;
bash)
# macOS Terminal opens a login shell (reads .bash_profile);
# Linux terminals are interactive non-login (read .bashrc).
if [ "$OS" = "darwin" ]; then rc="$HOME/.bash_profile"; else rc="$HOME/.bashrc"; fi
;;
fish) rc="$HOME/.config/fish/config.fish" ;;
*) rc="$HOME/.profile" ;;
esac

if [ "$shell_name" = "fish" ]; then
path_line="fish_add_path $PREFIX"
else
path_line="export PATH=\"$PREFIX:\$PATH\""
fi

# Track three outcomes precisely so the message can neither over- nor
# under-claim: already configured / freshly added / couldn't write.
state=failed
mkdir -p "$(dirname "$rc")" 2>/dev/null || true
# Idempotency: only an actual, non-comment PATH op that references $PREFIX
# counts as "already configured" — a bare comment or an unrelated line that
# merely mentions the dir must NOT pass, or we'd claim success while a new
# shell still can't find the binary (#61). Match PATH= / PATH+= /
# fish_add_path / zsh's path+=() (case-insensitive); the [^A-Za-z_] guard
# keeps PYTHONPATH=/MYPATH= out.
if grep -v '^[[:space:]]*#' "$rc" 2>/dev/null \
| grep -iE '(^|[^A-Za-z_])path[+]?=|fish_add_path' \
| grep -qF "$PREFIX"; then
state=present # rc already persists it — leave it alone
# Group the append so the redirection-open error (e.g. a read-only rc, or
# an unwritable parent dir) is suppressed too: `cmd >> "$rc" 2>/dev/null`
# leaks the shell's "Permission denied" because the >> open is attempted
# before 2>/dev/null applies. Wrapping in { ... } 2>/dev/null puts the
# stderr redirect in scope first.
elif { printf '\n# Added by the tracebloc CLI installer\n%s\n' "$path_line" >> "$rc"; } 2>/dev/null; then
state=added
fi

echo ""
case "$state" in
added)
echo "Added $PREFIX to your PATH in $rc."
echo "Open a new terminal — or load it now: . \"$rc\""
Comment thread
cursor[bot] marked this conversation as resolved.
;;
present)
echo "$PREFIX is already in your PATH config ($rc) — nothing to add."
echo "If a new terminal can't find it yet, open one — or load it now: . \"$rc\""
;;
*)
echo "Note: the installer couldn't update your shell config ($rc)."
echo "Add this line to it (or your shell's startup file), then open a new terminal:"
echo " $path_line"
;;
esac
echo ""
fi

echo "First steps:"
echo " $BINARY_NAME cluster info # confirm CLI can reach your cluster"
echo " $BINARY_NAME dataset push --help # see the dominant flow"
Loading