Commit e39e90b: "updates"

1 parent d4281dc

11 files changed: 115 additions & 678 deletions

File tree

README.md

Lines changed: 13 additions & 19 deletions
@@ -27,34 +27,28 @@
 https://github.com/user-attachments/assets/8ca57b68-4d7a-42cb-bcce-43f8b1681ce2 -->

 <kbd>
-<img src="public/data-formulator-screenshot-v0.5.png">
+<img src="https://github.com/user-attachments/assets/3ffb15aa-93ce-42b8-92cf-aaf321f9a06a">
 </kbd>


 ## News 🔥🔥🔥
-[01-31-2025] **uv support** — Faster installation with uv
-- 🚀 **Install with uv**: Data Formulator now supports installation via [uv](https://docs.astral.sh/uv/), the ultra-fast Python package manager. Get started in seconds with `uvx data_formulator` or `uv pip install data_formulator`.
-
-[01-25-2025] **Data Formulator 0.6** — Real-time insights from live data
-- **Connect to live data**: Connect to URLs and databases with automatic refresh intervals. Visualizations update automatically as your data changes to provide live insights. [Demo: track the International Space Station's position and speed live](https://github.com/microsoft/data-formulator/releases/tag/0.6)
-- 🎨 **UI updates**: Unified UI for data loading; drag and drop fields directly from the data table to update visualization designs.
-
-[12-08-2025] **Data Formulator 0.5.1** — Connect more, visualize more, move faster
-- 🔌 **Community data loaders**: Google BigQuery, MySQL, Postgres, MongoDB
-- 📊 **New chart types**: US Map & Pie Chart (more to be added soon)
-- ✏️ **Editable reports**: Refine generated reports with [Chartifact](https://github.com/microsoft/chartifact) in markdown style. [demo](https://github.com/microsoft/data-formulator/pull/200#issue-3635408217)
-- **Snappier UI**: Noticeably faster interactions across the board
-
-[11-07-2025] **Data Formulator 0.5** — Vibe with your data, in control
-- 📊 **Load (almost) any data**: Load structured data, extract data from screenshots or messy text blocks, or connect to databases.
-- 🤖 **Explore data with AI agents**: Use agent mode for hands-off exploration, or stay in control in interactive mode.
-- **Verify AI-generated results**: Interact with charts and inspect data, formulas, explanations, and code.
-- 📝 **Create reports to share insights**: Choose the charts you want to share, and ask agents to create reports grounded in the data formulated throughout exploration.
+[03-02-2026] **Data Formulator 0.7 (alpha)** — More charts, new experience, enterprise-ready
+- 📊 **30 visualization types**: Expanded from ~15 to 30 chart types with a new semantic chart engine — new types include area, streamgraph, bump, candlestick, density, lollipop, pie, rose, waterfall, strip plot, radar, US map, world map, and more. Semantic field analysis automatically recommends the right chart for the data.
+- 💬 **New experience — hybrid chat + data thread**: Chat-based interaction is woven directly into the exploration thread. Richer thread cards show transformation lineage, chart previews, and agent reasoning in a unified timeline.
+- 🤖 **Redesigned agent architecture**: A unified `DataAgent` replaces four separate agents. A new recommendation agent suggests charts and exploration directions; a new insight agent generates natural-language takeaways from chart results.
+- 🏗️ **Workspace / Data Lake architecture**: A persistent, identity-based workspace layer manages user data with a `workspace.yaml` metadata catalog. Supports local and Azure Blob backends for enterprise deployments.
+- 🔒 **Security hardening**: Code signing for generated code, sandboxed execution (local & Docker), an authentication layer, and Flask rate limiting.
+- 📦 **UV-first build**: Fully reproducible builds with `uv.lock`; `uv sync` + `uv run data_formulator` is now the recommended development workflow.
+- 📝 A detailed writeup on the new architecture and design is coming soon — stay tuned!

 ## Previous Updates

 Here are the milestones that led to the current design:
+- **v0.6** ([Demo](https://github.com/microsoft/data-formulator/releases/tag/0.6)): Real-time insights from live data — connect to URLs and databases with automatic refresh
+- **uv support**: Faster installation with [uv](https://docs.astral.sh/uv/) via `uvx data_formulator` or `uv pip install data_formulator`
+- **v0.5.1** ([Demo](https://github.com/microsoft/data-formulator/pull/200#issue-3635408217)): Community data loaders, US Map & Pie Chart, editable reports, snappier UI
+- **v0.5**: Vibe with your data, in control — agent mode, data extraction, reports
 - **v0.2.2** ([Demo](https://github.com/microsoft/data-formulator/pull/176)): Goal-driven exploration with agent recommendations and performance improvements
 - **v0.2.1.3/4** ([Readme](https://github.com/microsoft/data-formulator/tree/main/py-src/data_formulator/data_loader) | [Demo](https://github.com/microsoft/data-formulator/pull/155)): External data loaders (MySQL, PostgreSQL, MSSQL, Azure Data Explorer, S3, Azure Blob)
 - **v0.2** ([Demos](https://github.com/microsoft/data-formulator/releases/tag/0.2)): Large data support with DuckDB integration
Binary files changed (not shown): -824 KB, -202 KB, 117 KB
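The 0.7 notes above say semantic field analysis recommends the right chart for the data. The actual engine is not part of this diff; as a rough illustration of the idea only, a recommender can infer each field's semantic type and map the combination of types to a chart suggestion. Every name below (`infer_field_type`, `recommend_chart`, the chart labels) is hypothetical, not Data Formulator's API:

```python
# Illustrative sketch of semantic chart recommendation (not the real engine):
# classify each field as temporal / quantitative / nominal, then map the
# combination of field types to a chart suggestion.
from datetime import datetime

def infer_field_type(values):
    """Classify a column of values as 'quantitative', 'temporal', or 'nominal'."""
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        return "quantitative"
    try:
        for v in values:
            datetime.fromisoformat(str(v))  # parses ISO dates like "2026-01-01"
        return "temporal"
    except ValueError:
        return "nominal"

def recommend_chart(table, fields):
    """Suggest a chart type for the selected fields of a row-oriented table."""
    types = sorted(infer_field_type([row[f] for row in table]) for f in fields)
    if types == ["quantitative", "temporal"]:
        return "line"      # trend of a measure over time
    if types == ["nominal", "quantitative"]:
        return "bar"       # magnitude per category
    if types == ["quantitative", "quantitative"]:
        return "scatter"   # relationship between two measures
    return "table"         # fall back to a raw view

table = [
    {"date": "2026-01-01", "sales": 10, "region": "east"},
    {"date": "2026-01-02", "sales": 14, "region": "west"},
]
print(recommend_chart(table, ["date", "sales"]))    # -> line
print(recommend_chart(table, ["region", "sales"]))  # -> bar
```

A real engine would also weigh cardinality, aggregation, and user intent; this sketch only shows the type-driven core of the idea.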

py-src/data_formulator/agent_routes.py

Lines changed: 0 additions & 99 deletions
@@ -35,7 +35,6 @@
 from data_formulator.agents.agent_report_gen import ReportGenAgent
 from data_formulator.agents.client_utils import Client

-from data_formulator.workflows.exploration_flow import run_exploration_flow_streaming
 from data_formulator.agents.data_agent import DataAgent

 # Get logger for this module (logging config done in app.py)
@@ -411,104 +410,6 @@ def derive_data():
     response.headers.add('Access-Control-Allow-Origin', '*')
     return response

-@agent_bp.route('/explore-data-streaming', methods=['GET', 'POST'])
-def explore_data_streaming():
-    def generate():
-        if request.is_json:
-            logger.setLevel(logging.INFO)
-
-            logger.info("# explore-data-streaming request")
-            content = request.get_json()
-            token = content["token"]
-
-            # each table is a dict with {"name": xxx, "rows": [...]}
-            input_tables = content["input_tables"]
-            initial_plan = content["initial_plan"]  # The exploration question
-            max_iterations = content.get("max_iterations", 3)  # Number of exploration iterations
-            max_repair_attempts = content.get("max_repair_attempts", 1)
-            agent_exploration_rules = content.get("agent_exploration_rules", "")
-            agent_coding_rules = content.get("agent_coding_rules", "")
-            conversation_history = content.get("conversation_history", None)
-
-            logger.debug("== input tables ===>")
-            for table in input_tables:
-                logger.debug(f"===> Table: {table['name']} (first 5 rows)")
-                logger.debug(table['rows'][:5])
-
-            logger.debug("== exploration question ===")
-            logger.debug(initial_plan)
-
-            # Model config for the exploration flow
-            model_config = {
-                "endpoint": content['model']['endpoint'],
-                "model": content['model']['model'],
-                "api_key": content['model']['api_key'],
-                "api_base": content['model'].get('api_base', ''),
-                "api_version": content['model'].get('api_version', '')
-            }
-
-            # Get identity for workspace (used for both SQL and Python with WorkspaceWithTempData)
-            identity_id = get_identity_id()
-
-            try:
-                for result in run_exploration_flow_streaming(
-                    model_config=model_config,
-                    input_tables=input_tables,
-                    initial_plan=initial_plan,
-                    identity_id=identity_id,
-                    max_iterations=max_iterations,
-                    max_repair_attempts=max_repair_attempts,
-                    agent_exploration_rules=agent_exploration_rules,
-                    agent_coding_rules=agent_coding_rules,
-                    conversation_history=conversation_history
-                ):
-                    response_data = {
-                        "token": token,
-                        "status": "ok",
-                        "result": result
-                    }
-
-                    yield json.dumps(response_data) + '\n'
-
-                    # Break if we get a completion result
-                    if result.get("type") == "completion":
-                        break
-
-            except Exception as e:
-                logger.setLevel(logging.WARNING)
-                logger.error(f"Error in exploration flow: {e}")
-                logger.error(traceback.format_exc())
-                error_data = {
-                    "token": token,
-                    "status": "error",
-                    "result": None,
-                    "error_message": str(e)
-                }
-                yield json.dumps(error_data) + '\n'
-
-            logger.setLevel(logging.WARNING)
-
-        else:
-            error_data = {
-                "token": "",
-                "status": "error",
-                "result": None,
-                "error_message": "Invalid request format"
-            }
-            yield json.dumps(error_data) + '\n'
-
-    response = Response(
-        stream_with_context(generate()),
-        mimetype='application/json',
-        headers={
-            'Access-Control-Allow-Origin': '*',
-            'Access-Control-Allow-Methods': 'GET, POST, OPTIONS',
-            'Access-Control-Allow-Headers': 'Content-Type'
-        }
-    )
-    return response
-
-
 @agent_bp.route('/data-agent-streaming', methods=['GET', 'POST'])
 def data_agent_streaming():
     """Autonomous data exploration agent endpoint (SWE-agent style).
