# NotebookLM Brain Interface — Design Spec

## Overview

A set of bash + Python scripts at `ops/brain/` that make the CMS-aiChemist NotebookLM notebook fully queryable and writable from the command line. This is the foundational interface between Claude Code (the hands) and NotebookLM (the brain).

## Core Principle

NotebookLM is the primary orchestration brain and persistent memory. Claude Code is an ephemeral executor. Every session: PULL context from NotebookLM → EXECUTE → PUSH learnings back. This interface formalizes that cycle into reusable scripts that need no pip packages (only bash, the Python stdlib, and the gcloud CLI).

## What Already Exists

- NotebookLM Enterprise API access confirmed (jonathan@noboxai.com, noboxAI project)
- CMS-aiChemist notebook created (ID: `7d8b1917-4466-4f33-b5cf-7a8fb283f160`)
- 4 sources uploaded (Agentic Frontend CMS Spec, Research - Dev Pipeline, Research - Penpot, Research - CMS Architecture)
- Query bridge prototype working (Gemini 2.5 Flash + source content as grounding context)
- gcloud SDK installed and authenticated
- `.env` configured with all credentials

## Architecture

### Data Flow

```
NotebookLM (Cloud Brain)                    Local Mirror
┌─────────────────────┐                    ┌──────────────────┐
│ Discovery Engine API │◄── brain-pull ───►│ handoffs/*.md    │
│ - Source CRUD        │    brain-push ───►│ .brain/inventory │
│ - Notebook metadata  │                   │ .brain/sessions  │
└─────────────────────┘                    └──────────────────┘
                                                    │
                                            brain-query reads
                                                    │
                                                    ▼
                                           ┌──────────────────┐
                                           │ Gemini 2.5 Flash │
                                           │ generateContent   │
                                           │ (grounded Q&A)    │
                                           └──────────────────┘
```

### Components

| Script | Purpose | Reads | Writes |
|--------|---------|-------|--------|
| `brain-query.sh` | Multi-source grounded Q&A via Gemini | `handoffs/*.md`, `.brain/inventory.json` | `.brain/sessions/*.log`, stdout |
| `brain-push.sh` | Upload new source to NotebookLM | stdin or file arg | NotebookLM API, `handoffs/`, `.brain/inventory.json`, `.brain/sessions/*.log` |
| `brain-pull.sh` | Sync source inventory from API | NotebookLM API, `handoffs/` | `.brain/inventory.json`, `.brain/sessions/*.log`, stdout |
| `brain-status.sh` | Notebook state report | `.brain/inventory.json`, `.env`, gcloud | `.brain/sessions/*.log`, stdout |
| `_brain_common.py` | Shared Python module | `.env`, `.brain/config.env`, `handoffs/*.md` | (library, no direct output) |

## Interface Contracts

### brain-query.sh

```bash
# All sources as context (default)
brain-query.sh "What are the handoff stages?"

# Pipe question from stdin
echo "Summarize the Penpot integration model" | brain-query.sh

# Target specific source(s)
brain-query.sh --source "Agentic Frontend CMS Spec" "What agent roles are defined?"

# JSON output for programmatic use
brain-query.sh --format json "What systems are in the architecture?"

# Override model (default: gemini-2.5-flash)
brain-query.sh --model gemini-2.5-pro "Deep analysis of handoff gaps"
```

**Query flow**:
1. Load all source content from `handoffs/*.md` (or filtered by `--source`)
2. Construct grounding prompt: `"You are grounded in the following NotebookLM sources. Answer ONLY from this context.\n\nSOURCE 1: {title}\n{content}\n\nSOURCE 2: ..."`
3. Send to Gemini `generateContent` endpoint with `temperature: 0.1`
4. Return response to stdout
5. Log to `.brain/sessions/YYYY-MM-DD.log`
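
Steps 2 and 3 of the query flow can be sketched as follows. This is a minimal illustration, not the final implementation: the function names `build_grounding_prompt` and `build_request_body` are hypothetical, and placing the question after the sources under a `QUESTION:` label is an assumption (the spec only fixes the preamble and the `SOURCE n:` layout).

```python
def build_grounding_prompt(sources, question):
    """Assemble the grounding prompt from a list of {title, content} dicts."""
    header = (
        "You are grounded in the following NotebookLM sources. "
        "Answer ONLY from this context."
    )
    parts = [header]
    for index, source in enumerate(sources, start=1):
        parts.append(f"SOURCE {index}: {source['title']}\n{source['content']}")
    # Assumption: the question goes last; the spec does not say where it is placed.
    parts.append(f"QUESTION: {question}")
    return "\n\n".join(parts)


def build_request_body(prompt, temperature=0.1):
    """Build the JSON body for a Gemini generateContent call."""
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {"temperature": temperature},
    }
```

Keeping prompt assembly separate from the HTTP call makes the grounding format easy to test without network access.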

### brain-push.sh

```bash
# Push file as new source
brain-push.sh "Decision Log - 2026-04-06 - Brain Interface" ./docs/decision.md

# Push from stdin
echo "Session learned X, Y, Z" | brain-push.sh "Session Notes - 2026-04-06"
```

**Push flow**:
1. Read content from file arg or stdin
2. Save local mirror to `handoffs/` (slugified title as filename)
3. Call NotebookLM `sources:batchCreate` API with `textContent`
4. Update `.brain/inventory.json` with new source metadata
5. Log to `.brain/sessions/YYYY-MM-DD.log`
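
Step 2 of the push flow depends on turning a source title into a filename. The spec does not define the slug rules, so the sketch below assumes lowercase ASCII with runs of other characters collapsed to a single hyphen; `slugify` and `mirror_path` are hypothetical helper names.

```python
import re
from pathlib import Path


def slugify(title):
    """Lowercase the title and collapse every non-alphanumeric run to one hyphen."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    # Assumption: fall back to a fixed name if the title has no usable characters.
    return slug or "untitled"


def mirror_path(project_root, title):
    """Return where the local mirror of a pushed source would be written."""
    return Path(project_root) / "handoffs" / f"{slugify(title)}.md"
```

Under these rules, `"Decision Log - 2026-04-06 - Brain Interface"` maps to `handoffs/decision-log-2026-04-06-brain-interface.md`. Two titles that differ only in punctuation would collide, so a real implementation may want a collision check before writing.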

### brain-pull.sh

```bash
# Sync inventory
brain-pull.sh
# Output: "4 sources synced. 0 new sources detected."

# Show diff against local mirrors
brain-pull.sh --diff
# Output: lists sources in cloud but not locally, and vice versa
```

**Pull flow**:
1. Call NotebookLM notebook GET endpoint (returns full source list with metadata)
2. Write to `.brain/inventory.json`
3. Compare against files in `handoffs/`
4. Report new/missing/synced counts
5. Flag sources added via web UI that have no local mirror (API does not return content body — user must manually export or re-upload)
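
Steps 3 to 5 of the pull flow reduce to a set comparison between what the API lists and what exists under `handoffs/`. A minimal sketch, assuming both sides have already been normalized to the same key (for example a slugified title); `diff_inventory` and `summarize` are hypothetical names.

```python
def diff_inventory(cloud_keys, local_keys):
    """Classify sources as synced, cloud-only, or local-only."""
    cloud, local = set(cloud_keys), set(local_keys)
    return {
        "synced": sorted(cloud & local),
        "cloud_only": sorted(cloud - local),  # e.g. added via the web UI
        "local_only": sorted(local - cloud),  # mirror exists but source is gone
    }


def summarize(diff):
    """Format the one-line report shown in the usage example above."""
    return (
        f"{len(diff['synced'])} sources synced. "
        f"{len(diff['cloud_only'])} new sources detected."
    )
```

The `cloud_only` list is the set that needs manual export or re-upload, since the API returns metadata without the content body.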

### brain-status.sh

```bash
brain-status.sh
# Notebook: CMS-aiChemist (7d8b1917...)
# Sources: 4 (9,095 words / 17,962 tokens)
# Last sync: 2026-04-06T06:01:11Z
# Auth: jonathan@noboxai.com (token valid)
# Local mirror: 4/4 synced
```

## Shared Python Module: _brain_common.py

Provides:
- `load_config()` — reads `.env` and `.brain/config.env`, returns dict with `GOOGLE_API_KEY`, `NOTEBOOKLM_API_BASE`, `NOTEBOOKLM_NOTEBOOK_ID`, `GCLOUD_PATH`
- `get_auth_token()` — calls `gcloud auth print-access-token`, returns string
- `notebooklm_request(method, path, body=None)` — makes authenticated request to Discovery Engine API
- `gemini_request(prompt, model="gemini-2.5-flash", temperature=0.1)` — makes API-key-authenticated request to Generative Language API
- `load_sources(filter_titles=None)` — reads `handoffs/*.md`, returns list of `{title, content}`
- `log_session(event_type, details)` — appends to daily session log

Uses only Python stdlib (`json`, `urllib.request`, `subprocess`, `os`, `pathlib`, `datetime`). No pip dependencies.
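
Because no dotenv package is available, `load_config()` has to parse `KEY=VALUE` files itself. The sketch below shows one way to do that with the stdlib only. It assumes simple one-line assignments (no `export` prefix, no multi-line values) and that `.brain/config.env` wins over `.env`, which matches its description as optional overrides; `parse_env_text` and `merge_config` are hypothetical helper names.

```python
def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blanks, comments, and malformed lines."""
    config = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding single or double quotes from the value.
        config[key.strip()] = value.strip().strip("\"'")
    return config


def merge_config(base_text, override_text=""):
    """Combine .env content with optional overrides; overrides take precedence."""
    config = parse_env_text(base_text)
    config.update(parse_env_text(override_text))
    return config
```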

**Project root resolution**: `_brain_common.py` resolves the project root by walking up from its own location (`ops/brain/`) two levels. All paths (`.env`, `handoffs/`, `.brain/`) are relative to this root. Scripts can be called from any working directory.
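
A minimal sketch of that resolution with `pathlib`, assuming the module lives at `ops/brain/_brain_common.py` as in the file layout; `resolve_project_root` is a hypothetical helper name.

```python
from pathlib import Path


def resolve_project_root(module_file):
    """Return the project root for a module stored at <root>/ops/brain/<file>."""
    # parents[0] is ops/brain, parents[1] is ops, parents[2] is the project root.
    return Path(module_file).resolve().parents[2]
```

Inside the module this would be called as `resolve_project_root(__file__)`, so the result does not depend on the caller's working directory.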

## Error Handling

| Scenario | Behavior |
|----------|----------|
| gcloud not authenticated | Print: `"ERROR: gcloud not authenticated. Run: gcloud auth login --no-launch-browser"`, exit 1 |
| Missing .env / GOOGLE_API_KEY | Print: `"ERROR: GOOGLE_API_KEY not found. Check <project root>/.env"` with the resolved path filled in (currently `/home/jgatlit/apps/CMS/.env`), exit 1 |
| Gemini 429 (rate limit) | Retry once after 5s. Fail with clear message on second attempt. |
| Sources exceed Gemini context window | Truncate oldest sources first. Warn which sources were trimmed. |
| Push API failure | Local mirror saved first (nothing lost). Error message includes local file path. |
| Inventory drift (cloud sources not mirrored locally) | brain-pull flags them. Does not auto-create content (API returns metadata only, not body). |
| Network failure | All API calls wrapped with timeout (30s) and clear error messages. |
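
The 429 row can be sketched as a small wrapper around whatever function performs the HTTP call. This is an illustration under assumptions: `call_with_retry` is a hypothetical name, `send` stands for any zero-argument callable that raises `urllib.error.HTTPError` on failure, and the error text is a placeholder rather than a specified message.

```python
import time
import urllib.error


def call_with_retry(send, retry_delay=5.0, sleep=time.sleep):
    """Call send(); on HTTP 429, wait once and try again, then give up."""
    try:
        return send()
    except urllib.error.HTTPError as error:
        if error.code != 429:
            raise  # Other HTTP errors are not retried.
    sleep(retry_delay)
    try:
        return send()
    except urllib.error.HTTPError as error:
        if error.code == 429:
            raise RuntimeError(
                "ERROR: Gemini rate limit (429) persisted after one retry."
            ) from error
        raise
```

The 30s timeout from the network failure row would be applied inside `send`, for example by passing `timeout=30` to `urllib.request.urlopen`. Injecting `sleep` keeps the wrapper testable without real delays.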

## Session Logging

Every brain command (query, push, pull, status) appends to `.brain/sessions/YYYY-MM-DD.log`:

```
[2026-04-06T12:34:56] QUERY model=gemini-2.5-flash sources=4 tokens_in=18200 tokens_out=450
  Q: "What are the handoff stages?"
  A: (first 200 chars)...
[2026-04-06T12:35:10] PUSH title="Decision Log - Brain Interface" source_id=abc123 words=340
[2026-04-06T12:36:00] PULL sources=4 new=0 missing=0
[2026-04-06T12:36:05] STATUS sources=4 words=9095 tokens=17962 auth=valid mirror=4/4
```
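
The header line of each entry follows a `[timestamp] EVENT key=value ...` pattern, which a helper inside `log_session` could produce roughly as below. The names `format_log_entry` and `log_path_for` are hypothetical, and two details are assumptions: timestamps are taken in UTC, and string values containing spaces are wrapped in double quotes (as `title` is in the sample).

```python
from datetime import datetime, timezone


def _render(value):
    """Quote strings that contain spaces so the line stays parseable."""
    if isinstance(value, str) and " " in value:
        return f'"{value}"'
    return str(value)


def format_log_entry(event_type, fields, when=None):
    """Format one header line, e.g. '[2026-04-06T12:36:00] PULL sources=4 ...'."""
    when = when or datetime.now(timezone.utc)  # Assumption: logs use UTC.
    stamp = when.strftime("%Y-%m-%dT%H:%M:%S")
    pairs = " ".join(f"{key}={_render(value)}" for key, value in fields.items())
    return f"[{stamp}] {event_type} {pairs}"


def log_path_for(when):
    """Return the daily log path, relative to the project root."""
    return f".brain/sessions/{when.strftime('%Y-%m-%d')}.log"
```

The indented `Q:` and `A:` lines of a QUERY entry would be appended separately after the header line.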

## File Layout

```
ops/brain/
  brain-query.sh          # Multi-source grounded Q&A
  brain-push.sh           # Upload new source + local mirror
  brain-pull.sh           # Sync source inventory
  brain-status.sh         # Notebook state report
  _brain_common.py        # Shared: config, auth, API helpers
  README.md               # Usage reference
.brain/
  inventory.json          # Cached source metadata from API
  config.env              # Optional config overrides
  sessions/
    YYYY-MM-DD.log        # Daily session audit log
handoffs/
  *.md                    # Local mirrors of NotebookLM sources
```

## Dependencies

- Python 3.x (stdlib only — no pip install)
- gcloud CLI (already installed at `/home/jgatlit/google-cloud-sdk/bin/gcloud`)
- bash
- `.env` with `GOOGLE_API_KEY` and NotebookLM config (already configured)

## Success Criteria

1. `brain-query.sh "What are the 7 architecture principles?"` returns a correct, source-grounded answer
2. `brain-push.sh "Test Source" test.md` uploads to NotebookLM and creates local mirror
3. `brain-pull.sh` reports accurate inventory matching notebook state
4. `brain-status.sh` shows correct notebook metadata and auth status
5. All scripts work from any directory (paths are resolved against the project root, not the caller's working directory)
6. All scripts exit cleanly with informative errors on auth/config failures
7. Session log captures all operations for audit trail
