BigQuery Callback Handler
Google BigQuery is a serverless, cost-effective enterprise data warehouse that works across clouds and scales with your data. The BigQueryCallbackHandler allows you to log events from LangChain and LangGraph to Google BigQuery. This is useful for monitoring, auditing, and analyzing the performance of your LLM applications.
Key features:
- LangGraph support: automatic detection of LangGraph nodes with AGENT_STARTING/AGENT_COMPLETED events and top-level INVOCATION_STARTING/INVOCATION_COMPLETED boundaries (vocabulary aligned with Google ADK's BigQueryAgentAnalyticsPlugin)
- Auto-created analytics views: one typed-column CREATE OR REPLACE VIEW per event type (v_llm_response.usage_total_tokens instead of JSON_VALUE(...))
- Auto schema upgrade: additive ALTER TABLE ADD COLUMN on existing tables, gated by a schema-version label so it runs at most once per version
- Sub-agent attribution: the agent BigQuery column is auto-derived from langgraph_node when no explicit agent is set in metadata, so multi-agent graphs are tagged per sub-agent without any user changes
- Rich LLM telemetry: token usage (prompt_tokens/completion_tokens/total_tokens/cached_content_token_count), model_version, full usage_metadata, cache_metadata, plus llm_config (temperature, top_p, …) and tools on LLM_REQUEST
- Latency tracking: built-in latency measurement for all LLM and tool calls
- Event filtering: configurable allowlist / denylist plus an opt-in skip_internal_chain_events heuristic that drops noisy framework chains (ChannelWrite, RunnableLambda, …) without breaking trace continuity
- Graph context manager: explicit INVOCATION_* boundaries with accurate timing
- flush() between requests: drain the queue without tearing the handler down
- Real-time dashboard: FastAPI monitoring webapp with live event streaming
Preview release: The BigQuery Callback Handler is in Preview. APIs and functionality are subject to change. For more information, see the launch stage descriptions.
Installation
You need to install langchain-google-community with the bigquery extra dependencies. For this example, you will also need langchain-google-genai and langgraph.
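Assuming a standard pip setup, the install command would look like:

```shell
pip install "langchain-google-community[bigquery]" langchain-google-genai langgraph
```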
Prerequisites
- Google Cloud Project with the BigQuery API enabled.
- BigQuery Dataset: Create a dataset to store logging tables before using the callback handler. The callback handler automatically creates the necessary events table within the dataset if the table does not exist.
- Google Cloud Storage Bucket (Optional): If you plan to log multimodal content (images, audio, etc.), creating a GCS bucket is recommended for offloading large files.
- Authentication:
  - Local: Run gcloud auth application-default login.
  - Cloud: Ensure your service account has the required permissions.
IAM Permissions
For the callback handler to work properly, the principal (e.g., service account, user account) under which the application is running needs these Google Cloud roles:
- roles/bigquery.jobUser at the project level to run BigQuery queries.
- roles/bigquery.dataEditor at the table level to write log/event data.
- If using GCS offloading: roles/storage.objectCreator and roles/storage.objectViewer on the target bucket.
Use with LangGraph agent
To use the BigQueryCallbackHandler with a LangGraph agent, instantiate it with your Google Cloud project ID and dataset ID. The handler creates the events table (and per-event-type analytics views) on first run. Use the graph_context() method to track top-level invocation boundaries: it emits INVOCATION_STARTING on enter and INVOCATION_COMPLETED (or INVOCATION_ERROR on exception) on exit, with accurate latency.
Pass session_id, user_id, and (optionally) agent via the metadata dictionary in the config object when invoking the agent. If agent is not set, the handler auto-derives it from metadata['langgraph_node'] so each sub-agent’s events are correctly attributed.
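A minimal end-to-end sketch of the above. BigQueryCallbackHandler, graph_context(), and the metadata keys come from this page; the import paths, constructor argument names, and the Gemini model choice are assumptions:

```python
from langchain_google_community import BigQueryCallbackHandler  # import path assumed
from langchain_google_genai import ChatGoogleGenerativeAI
from langgraph.prebuilt import create_react_agent

# The handler creates the events table (and analytics views) on first run.
handler = BigQueryCallbackHandler(
    project_id="my-gcp-project",  # argument names assumed
    dataset_id="agent_logs",
)

agent = create_react_agent(ChatGoogleGenerativeAI(model="gemini-2.5-flash"), tools=[])

config = {
    "callbacks": [handler],
    # session_id / user_id are promoted to first-class columns;
    # agent is optional and auto-derived from langgraph_node if omitted.
    "metadata": {"session_id": "session-123", "user_id": "user-456"},
}

# graph_context() emits INVOCATION_STARTING on enter and
# INVOCATION_COMPLETED (or INVOCATION_ERROR) on exit, with latency.
with handler.graph_context():
    result = agent.invoke({"messages": [("user", "What's the weather?")]}, config=config)
```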
Configuration options
You can customize the callback handler using BigQueryLoggerConfig.
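For example, a sketch using only the option names documented on this page (the exact constructor signature and handler wiring are assumptions):

```python
from langchain_google_community import BigQueryLoggerConfig  # import path assumed

config = BigQueryLoggerConfig(
    max_content_length=500_000,        # inline cap before GCS offload / truncation
    gcs_bucket_name="my-agent-logs",   # offload large multimodal content
    event_denylist=["TEXT"],           # skip event types you don't need
    skip_internal_chain_events=True,   # drop noisy framework CHAIN_* events
    custom_tags={"env": "prod", "agent_role": "sales"},
    create_views=True,                 # per-event-type analytics views
)
```

Pass the resulting config to the BigQueryCallbackHandler constructor when instantiating the handler (the parameter name for doing so is an assumption).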
Its options include:
- A flag to disable logging to BigQuery entirely; set it to False to turn the handler off.
- The fields used to cluster the BigQuery table when it is automatically created.
- The name of the GCS bucket to offload large content (images, blobs, large text) to. If not provided, large content may be truncated or replaced with placeholders.
- The BigQuery connection ID (e.g., us.my-connection) to use as the authorizer for ObjectRef columns. Required for using ObjectRef with BigQuery ML.
- max_content_length (default 500 KB): the maximum length (in characters) of text content to store inline in BigQuery before offloading to GCS (if configured) or truncating.
- The number of events to batch before writing to BigQuery.
- The maximum time (in seconds) to wait before flushing a partial batch.
- The number of seconds to wait for logs to flush during shutdown.
- event_allowlist: a list of event types to log. If None, all events are logged except those in event_denylist.
- event_denylist: a list of event types to skip logging.
- Whether to log detailed content parts (including GCS references).
- The default table ID to use if none is explicitly provided to the callback handler constructor.
- Configuration for retry logic (max retries, delay, multiplier) when writing to BigQuery fails.
- The maximum number of events to hold in the internal buffer queue before dropping new events.
- skip_internal_chain_events: when True, drops CHAIN_* events emitted by framework-internal Runnables (ChannelWrite, ChannelRead, Branch, RunnableLambda, RunnableSequence, RunnableParallel, RunnableAssign, RunnablePassthrough, RunnableBinding, Pregel, __start__, __end__). Skipped runs are still registered in the trace registry so child LLM/tool events keep the real graph root as their trace_id (no broken traces). Each suppression logs a DEBUG line so the heuristic is auditable.
- custom_tags: static tags written to attributes.custom_tags on every event row. Useful for slicing dashboards by deployment, cohort, or experiment (e.g. {"env": "prod", "agent_role": "sales"}).
- A flag that, when True, dumps the user-supplied RunnableConfig metadata (minus keys already promoted to first-class columns like session_id, user_id, agent, langgraph_node) under attributes.session_metadata.
- An optional (raw_content, event_type) -> formatted hook invoked before content parsing. Useful for PII redaction or coercing custom payloads. Failures fall back to raw content with a warning; the formatter cannot break the agent.
- auto_schema_upgrade: when True, additively runs ALTER TABLE ADD COLUMN for any new fields that future versions of this handler add to the events schema. Gated by a langchain_bq_schema_version table label so the diff runs at most once per schema version. It never drops, renames, or retypes columns.
- create_views: when True, automatically runs CREATE OR REPLACE on per-event-type analytics views beside the events table. Each view unnests the JSON columns into typed top-level columns (see Auto-created analytics views below).
- view_prefix: the prefix for auto-created view names (v_llm_request, v_tool_completed, …). Set it per table when several handler instances share one dataset to avoid collisions.
Schema and production setup
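A production table of the shape this page describes, sketched from the documented columns (exact types, partitioning, clustering, and the content_parts sub-fields are assumptions):

```sql
CREATE TABLE IF NOT EXISTS `my-project.agent_logs.agent_events` (
  timestamp TIMESTAMP NOT NULL,
  event_type STRING,
  agent STRING,
  session_id STRING,
  invocation_id STRING,
  user_id STRING,
  trace_id STRING,
  span_id STRING,
  parent_span_id STRING,
  status STRING,
  error_message STRING,
  is_truncated BOOL,
  content JSON,       -- event-type-specific payload (always carries a summary key)
  attributes JSON,    -- tags, usage, custom_tags, session_metadata, ...
  latency_ms JSON,    -- built-in latency measurements
  content_parts ARRAY<STRUCT<
    mime_type STRING,
    text STRING,
    object_ref STRING -- GCS reference for offloaded content
  >>
)
PARTITION BY DATE(timestamp)
CLUSTER BY session_id, event_type;
```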
The plugin automatically creates the table if it does not exist. However, for production we recommend creating the table manually, using a DDL that uses the JSON type for flexibility and REPEATED RECORDs for multimodal content.
Auto-created analytics views
When the handler creates the events table, it also creates one CREATE OR REPLACE VIEW per event type beside it (controlled by create_views, default True). Each view unnests the JSON columns into typed top-level columns so analytics queries don't have to spell JSON_VALUE(...) every time.
The table below lists the views (named with the configured view_prefix) and the typed columns each one adds on top of the always-included columns:
| View | Extra typed columns |
|---|---|
v_invocation_starting | (none — only the always-included columns) |
v_invocation_completed / v_invocation_error | total_ms |
v_agent_starting | node_name, step |
v_agent_completed | node_name, step, total_ms |
v_agent_error | node_name, total_ms |
v_llm_request | model, request_content, llm_config, tools |
v_llm_response | response, usage_prompt_tokens, usage_completion_tokens, usage_total_tokens, usage_cached_tokens, context_cache_hit_rate, total_ms, ttft_ms, model_version, usage_metadata, cache_metadata |
v_llm_error | total_ms |
v_tool_starting | tool_name, tool_args |
v_tool_completed | tool_name, tool_result, total_ms |
v_tool_error | tool_name, total_ms |
v_retriever_start | query |
v_retriever_end / v_retriever_error | total_ms |
Every view always includes the core columns (timestamp, event_type, agent, session_id, invocation_id, user_id, trace_id, span_id, parent_span_id, status, error_message, is_truncated) plus three columns lifted from the attributes JSON: root_agent_name, custom_tags, session_metadata.
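For example, daily token usage and latency can then be read straight off the typed columns of v_llm_response (the project and dataset names are placeholders):

```sql
SELECT
  DATE(timestamp) AS day,
  model_version,
  SUM(usage_total_tokens) AS total_tokens,
  AVG(total_ms) AS avg_latency_ms,
  APPROX_QUANTILES(total_ms, 100)[OFFSET(95)] AS p95_latency_ms
FROM `my-project.agent_logs.v_llm_response`
GROUP BY day, model_version
ORDER BY day DESC;
```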
Auto schema upgrade
Existing tables are auto-upgraded additively when the handler's schema gains new columns. The handler reads the table at startup and runs ALTER TABLE ADD COLUMN for any new fields, gated by a langchain_bq_schema_version table label so the diff runs at most once per schema version. The upgrade never drops, renames, or retypes columns.
Disable with auto_schema_upgrade=False.
Sub-agent attribution
For multi-agent LangGraph deployments, the agent BigQuery column is auto-derived from this fallback chain:
1. metadata["agent"]: explicit user-supplied value (highest priority)
2. metadata["langgraph_node"]: the active LangGraph node, so each sub-agent's events are tagged with the node name
3. metadata["checkpoint_ns"]: the LangGraph checkpoint namespace
4. handler.graph_name: fallback for top-level INVOCATION_* events
A multi-agent graph (with nodes such as TheCritic, TheMeteo, …) thus produces telemetry where each event is attributed to the originating sub-agent without any user changes.
Event types and payloads
The content column contains a JSON object specific to the event_type.
The content_parts column provides a structured view of the content, especially useful for images or offloaded data.
Content Truncation
- Variable content fields are truncated to max_content_length (configured in BigQueryLoggerConfig, default 500 KB).
- If gcs_bucket_name is configured, large content is offloaded to GCS instead of being truncated, and a reference is stored in content_parts.object_ref.
content always carries a summary key: every event row's content JSON object includes a summary string with a human-readable preview of the payload (capped at max_content_length). The summary is omitted from the per-event tables below to keep the shapes readable, but it is always present on disk.
LLM interactions
These events track the raw requests sent to and responses received from the LLM.
| Event Type | Content (JSON) Structure | Attributes (JSON) |
|---|---|---|
| LLM_REQUEST | Chat Model: {"messages": [<dumped messages>]}; Legacy Model: {"prompt": [<prompts>]} | {"tags": ["..."], "model": "...", "llm_config": {"temperature": 0.2, ...}, "tools": ["get_weather", ...]} |
| LLM_RESPONSE | {"response": "<generation text>"} | {"usage": {"prompt_tokens": 100, "completion_tokens": 25, "total_tokens": 125}, "model_version": "gemini-2.5-flash-001", "usage_metadata": {"cached_content_token_count": 30, ...}, "cache_metadata": {...}} |
| LLM_ERROR | {"data": null} (the actual exception text lives in the error_message column) | {} |
Sub-agent (LangGraph node) and invocation lifecycle
These events come from LangGraph's node and graph-context lifecycle. agent is auto-derived from metadata['langgraph_node'] when no
explicit agent is set, so events are tagged per sub-agent.
| Event Type | Description |
|---|---|
AGENT_STARTING / AGENT_COMPLETED / AGENT_ERROR | A LangGraph node begins / ends / errors |
INVOCATION_STARTING / INVOCATION_COMPLETED / INVOCATION_ERROR | Top-level graph invocation begins / ends / errors (emitted by handler.graph_context()) |
Tool usage
These events track the execution of tools by the agent. The tool name is also surfaced in attributes.tool_name and (for the auto-views) as a typed tool_name column.
| Event Type | Content (JSON) Structure |
|---|---|
TOOL_STARTING | {"tool": "<name>", "input": "<input string>"} — e.g. {"tool": "get_weather", "input": "city='Paris'"} |
TOOL_COMPLETED | {"tool": "<name>", "result": "<output string>"} — e.g. {"tool": "get_weather", "result": "25°C, Sunny"} |
TOOL_ERROR | {"data": null} (the actual exception text lives in the error_message column) |
Chain execution
These events fire for non-graph LangChain Runnable lifecycles (graph invocations and LangGraph nodes use the INVOCATION_* / AGENT_* events listed above instead).
| Event Type | Content (JSON) Structure |
|---|---|
CHAIN_START | {"data": "<JSON-stringified inputs>"} |
CHAIN_END | {"data": "<JSON-stringified outputs>"} |
CHAIN_ERROR | {"data": null} (see error_message column) |
Retriever usage
These events track the execution of retrievers.
| Event Type | Content (JSON) Structure |
|---|---|
RETRIEVER_START | {"data": "<query string>"} — e.g. {"data": "What is the capital of France?"} |
RETRIEVER_END | {"data": "<JSON-stringified list of documents>"} |
RETRIEVER_ERROR | {"data": null} (the actual exception text lives in the error_message column) |
Agent Actions
These events come from legacy LangChain AgentExecutor-style agents (on_agent_action / on_agent_finish). The data field contains a JSON-serialized string of the action / finish payload.
| Event Type | Content (JSON) Structure |
|---|---|
AGENT_ACTION | {"data": "{\"tool\": \"Calculator\", \"input\": \"2 + 2\"}"} |
AGENT_FINISH | {"data": "{\"output\": \"The answer is 4\"}"} |
Other Events
| Event Type | Content (JSON) Structure |
|---|---|
TEXT | {"data": "<text>"} — e.g. {"data": "Some logging text..."} |
Advanced analysis queries
Once your agent is running and logging events, you can run powerful analyses on the agent_events table.
1. Reconstruct a Trace (Conversation Turn)
Use the trace_id to group all events (Chain, LLM, Tool) belonging to a single execution flow.
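A sketch of such a query against the events table (the project, dataset, and trace_id value are placeholders; the summary key is documented above):

```sql
SELECT
  timestamp,
  event_type,
  agent,
  span_id,
  parent_span_id,
  JSON_VALUE(content, '$.summary') AS summary
FROM `my-project.agent_logs.agent_events`
WHERE trace_id = 'YOUR_TRACE_ID'
ORDER BY timestamp;
```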
2. Analyze LLM Latency & Token Usage
Calculate the average latency and total token usage for your LLM calls.
3. Analyze Multimodal Content with BigQuery Remote Model (Gemini)
If you are offloading images to GCS, you can use BigQuery ML to analyze them directly.
4. Analyze Span Hierarchy & Duration
Visualize the execution flow and performance of your agent's operations (LLM calls, Tool usage) using span IDs.
5. Querying Offloaded Content (Get Signed URLs)
6. Advanced SQL Scenarios
These advanced patterns demonstrate how to sessionize data, analyze tool usage, and perform root cause analysis using BigQuery ML.
Conversational Analytics in BigQuery
You can also use BigQuery Conversational Analytics to analyze your agent logs using natural language.
Just ask questions like:
- “Show me the error rate over time”
- “What are the most common tool calls?”
- “Identify sessions with high token usage”
Looker Studio Dashboard
You can visualize your agent's performance using our prebuilt Looker Studio Dashboard template. To connect this dashboard to your own BigQuery table, use the template's link format, replacing the placeholders with your specific project, dataset, and table IDs.
LangGraph integration
The BigQueryCallbackHandler provides enhanced support for LangGraph agents with automatic node detection, graph-level tracking, and latency measurements.
LangGraph event types
In addition to standard LangChain events, the callback handler automatically detects and logs LangGraph-specific events:
| Event Type | Description |
|---|---|
AGENT_STARTING | Emitted when a LangGraph node begins execution |
AGENT_COMPLETED | Emitted when a LangGraph node completes successfully |
AGENT_ERROR | Emitted when a LangGraph node fails |
INVOCATION_STARTING | Emitted when a graph execution begins (via context manager) |
INVOCATION_COMPLETED | Emitted when a graph execution completes |
INVOCATION_ERROR | Emitted when a graph execution fails |
Graph context manager
Use the graph_context() method to explicitly mark graph execution boundaries. This enables INVOCATION_STARTING and INVOCATION_COMPLETED events with accurate latency measurements.
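A minimal sketch, assuming a handler and compiled agent set up as in the earlier sections:

```python
# Wrap each top-level run so it is bracketed by INVOCATION_STARTING /
# INVOCATION_COMPLETED (or INVOCATION_ERROR) events with measured latency.
with handler.graph_context():
    agent.invoke(
        {"messages": [("user", "Summarize today's sales calls")]},
        config={"callbacks": [handler], "metadata": {"session_id": "s1", "user_id": "u1"}},
    )
```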
Latency tracking
The callback handler automatically tracks latency for all operations and stores measurements in the latency_ms JSON column.
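For instance, average latencies per event type could be queried like this sketch; the total_ms key inside latency_ms is an assumption based on the typed view columns listed earlier:

```sql
SELECT
  event_type,
  AVG(CAST(JSON_VALUE(latency_ms, '$.total_ms') AS FLOAT64)) AS avg_total_ms
FROM `my-project.agent_logs.agent_events`
WHERE event_type IN ('LLM_RESPONSE', 'TOOL_COMPLETED')
GROUP BY event_type;
```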
Event filtering
Use event_allowlist and event_denylist to control which events are logged.
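A sketch of both modes; the import path, and whether event types are passed as plain strings, are assumptions:

```python
from langchain_google_community import BigQueryLoggerConfig  # import path assumed

# Log only LLM events, dropping everything else.
llm_only = BigQueryLoggerConfig(
    event_allowlist=["LLM_REQUEST", "LLM_RESPONSE", "LLM_ERROR"],
)

# Log everything except verbose chain bookkeeping events.
no_chains = BigQueryLoggerConfig(
    event_denylist=["CHAIN_START", "CHAIN_END"],
)
```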
Examples and resources
Example code
The following examples demonstrate various features of the BigQuery callback handler:
| Example | Description |
|---|---|
| Basic example | Basic callback usage with LLM calls |
| LangGraph agent | Complete ReAct agent with 6 realistic tools |
| Async example | Async handler with concurrent queries |
| Event filtering | Allowlist/denylist configurations |
| Sample data generator | Generate sample data across multiple agent types |
Analytics notebook
The LangGraph Agent Analytics notebook provides comprehensive BigQuery analytics queries for:
- Real-time event monitoring
- Tool usage analytics
- Latency analysis and trends
- Error debugging
- User engagement metrics
- Time-series visualization
Real-time monitoring dashboard
A FastAPI-based monitoring dashboard is available for real-time agent monitoring. Features:
- Live event stream via Server-Sent Events (SSE)
- Interactive charts for event distribution and latency trends
- Session tracing with detailed timeline view
- 20+ REST API endpoints for analytics queries
- Auto-refresh every 5 seconds
Feedback
We welcome your feedback on BigQuery Agent Analytics. If you have questions, suggestions, or encounter any issues, please reach out to the team at bqaa-feedback@google.com.