Plan restrictions apply: Bulk export is only available on LangSmith Plus or Enterprise tiers.
LangSmith can export trace data to a Google Cloud Storage (GCS) bucket in Parquet format. BigQuery can then query that data directly as an external table.

Architecture: LangSmith → GCS (Parquet, Hive-partitioned) → BigQuery external table

This guide covers:
  • Setting up a GCS bucket and HMAC credentials for LangSmith
  • Creating a bulk export destination and export job
  • Creating a BigQuery external table over the exported data
  • Example queries and troubleshooting tips
For full details on bulk export configuration options, see Bulk export trace data and Manage bulk export destinations.

1. Create a GCS bucket

Create a dedicated GCS bucket for LangSmith exports. Using a dedicated bucket makes it easier to grant scoped permissions without affecting other data.
gcloud storage buckets create gs://YOUR_BUCKET_NAME \
  --location=US \
  --uniform-bucket-level-access
Choose a region close to your BigQuery dataset to minimize latency and avoid cross-region egress charges.

2. Create a service account and grant access

Create a GCP service account that LangSmith will use to write data to GCS:
gcloud iam service-accounts create langsmith-bulk-export \
  --display-name="LangSmith Bulk Export"
Grant the service account write access to your bucket. The minimum required permission is storage.objects.create. Granting storage.objects.delete is optional but recommended: LangSmith uses it to clean up a temporary test file created during destination validation, and if the permission is absent, a tmp/ folder may remain in your bucket. The "Storage Object Admin" predefined role covers all required and recommended permissions:
gcloud storage buckets add-iam-policy-binding gs://YOUR_BUCKET_NAME \
  --member="serviceAccount:langsmith-bulk-export@YOUR_PROJECT.iam.gserviceaccount.com" \
  --role="roles/storage.objectAdmin"
To use a minimal custom role instead, grant only:
  • storage.objects.create (required)
  • storage.objects.delete (optional, for test file cleanup)
  • storage.objects.get (optional but recommended, for file size verification)
  • storage.multipartUploads.create (optional but recommended, for large file uploads)

3. Generate HMAC keys

LangSmith connects to GCS using the S3-compatible XML API, which requires HMAC keys rather than a service account JSON key. Generate HMAC keys for your service account:
gcloud storage hmac keys create \
  langsmith-bulk-export@YOUR_PROJECT.iam.gserviceaccount.com
Save the accessId and secret from the output. You can also generate HMAC keys in the GCP Console under Cloud Storage → Settings → Interoperability → Create a key for a service account.
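If you script key creation, the two values LangSmith needs can be pulled out of JSON output. A minimal sketch, assuming gcloud is run with --format=json and emits the HMAC key resource shape (access ID under metadata, secret at the top level); the sample document below is illustrative, not real output:

```python
import json

# Illustrative stand-in for `gcloud storage hmac keys create ... --format=json`
# output; the accessId field sits under "metadata", the secret at the top level.
sample = json.loads("""
{
  "metadata": {
    "accessId": "GOOG1EXAMPLEACCESSID",
    "serviceAccountEmail": "langsmith-bulk-export@YOUR_PROJECT.iam.gserviceaccount.com",
    "state": "ACTIVE"
  },
  "secret": "EXAMPLE-SECRET-VALUE"
}
""")

access_key_id = sample["metadata"]["accessId"]  # use as access_key_id in LangSmith
secret_access_key = sample["secret"]            # use as secret_access_key
```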

4. Create a bulk export destination

Create a destination in LangSmith pointing to your GCS bucket. Set endpoint_url to https://storage.googleapis.com to use the GCS S3-compatible API. You will need your LangSmith API key and workspace ID.
curl --request POST \
  --url 'https://api.smith.langchain.com/api/v1/bulk-exports/destinations' \
  --header 'Content-Type: application/json' \
  --header 'X-API-Key: YOUR_API_KEY' \
  --header 'X-Tenant-Id: YOUR_WORKSPACE_ID' \
  --data '{
    "destination_type": "s3",
    "display_name": "GCS for BigQuery",
    "config": {
      "bucket_name": "YOUR_BUCKET_NAME",
      "prefix": "YOUR_PREFIX",
      "endpoint_url": "https://storage.googleapis.com"
    },
    "credentials": {
      "access_key_id": "YOUR_HMAC_ACCESS_ID",
      "secret_access_key": "YOUR_HMAC_SECRET"
    }
  }'
prefix is a path within the bucket where LangSmith will write exported files, for example langsmith-exports or data/traces. Choose any value that fits your bucket layout.

LangSmith validates the credentials by performing a test write before saving the destination. If the request returns a 400 error, refer to Debug destination errors. Save the id from the response; you will need it in the next step.
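If you manage destinations from code rather than curl, the same request body can be assembled programmatically. A minimal Python sketch that builds the destination payload shown above (the network call itself is omitted; send the body with any HTTP client, along with the X-API-Key and X-Tenant-Id headers):

```python
import json

# Endpoint from the curl example above.
DESTINATIONS_URL = "https://api.smith.langchain.com/api/v1/bulk-exports/destinations"

def build_destination_payload(bucket: str, prefix: str,
                              hmac_access_id: str, hmac_secret: str) -> dict:
    """Assemble the create-destination body from the curl example."""
    return {
        "destination_type": "s3",  # GCS is addressed via its S3-compatible API
        "display_name": "GCS for BigQuery",
        "config": {
            "bucket_name": bucket,
            "prefix": prefix,
            "endpoint_url": "https://storage.googleapis.com",
        },
        "credentials": {
            "access_key_id": hmac_access_id,
            "secret_access_key": hmac_secret,
        },
    }

body = json.dumps(build_destination_payload(
    "YOUR_BUCKET_NAME", "langsmith-exports",
    "YOUR_HMAC_ACCESS_ID", "YOUR_HMAC_SECRET",
)).encode()
# POST `body` to DESTINATIONS_URL to create the destination.
```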

Temporary validation file

During destination creation (and credential rotation), LangSmith writes a temporary .txt file to YOUR_PREFIX/tmp/ to verify write access, then attempts to delete it. The deletion is best-effort: if the service account lacks storage.objects.delete, the file is not deleted and the tmp/ folder remains in your bucket. The tmp/ folder is harmless and does not affect exports, but it will be included in broad GCS URI globs (e.g., gs://YOUR_BUCKET_NAME/YOUR_PREFIX/*). See Create a BigQuery external table for how to handle this when pointing BigQuery at your data.

5. Create a bulk export job

Create an export targeting a specific project. Use format_version: v2_beta for BigQuery compatibility: it produces UTC timezone-aware timestamps that BigQuery handles correctly. You will need the project ID (session_id), which you can copy from the project view in the Tracing Projects list.

One-time export:
curl --request POST \
  --url 'https://api.smith.langchain.com/api/v1/bulk-exports' \
  --header 'Content-Type: application/json' \
  --header 'X-API-Key: YOUR_API_KEY' \
  --header 'X-Tenant-Id: YOUR_WORKSPACE_ID' \
  --data '{
    "bulk_export_destination_id": "YOUR_DESTINATION_ID",
    "session_id": "YOUR_PROJECT_ID",
    "start_time": "2024-01-01T00:00:00Z",
    "end_time": "2024-02-01T00:00:00Z",
    "format_version": "v2_beta",
    "compression": "SNAPPY"
  }'
Scheduled (recurring) export:
curl --request POST \
  --url 'https://api.smith.langchain.com/api/v1/bulk-exports' \
  --header 'Content-Type: application/json' \
  --header 'X-API-Key: YOUR_API_KEY' \
  --header 'X-Tenant-Id: YOUR_WORKSPACE_ID' \
  --data '{
    "bulk_export_destination_id": "YOUR_DESTINATION_ID",
    "session_id": "YOUR_PROJECT_ID",
    "start_time": "2024-01-01T00:00:00Z",
    "interval_hours": 24,
    "format_version": "v2_beta",
    "compression": "SNAPPY"
  }'
SNAPPY compression is fast and widely supported by BigQuery. See Bulk export trace data for all available options, including field filtering and filter expressions.

Output file structure

Exported files land in GCS using a Hive-partitioned path structure:
gs://YOUR_BUCKET_NAME/YOUR_PREFIX/export_id=<uuid>/tenant_id=<uuid>/session_id=<uuid>/resource=runs/year=<year>/month=<month>/day=<day>/<filename>.parquet
The partition columns in the path (export_id, tenant_id, session_id, resource, year, month, day) are available as queryable columns in BigQuery when Hive partition detection is enabled.
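Because the partition keys are encoded as key=value path segments, they can be recovered from any object path without opening the Parquet file. A small sketch (the UUIDs below are placeholders, not real identifiers):

```python
def parse_partition_columns(object_path: str) -> dict:
    """Extract Hive partition columns (export_id, tenant_id, session_id,
    resource, year, month, day) from an exported object's path."""
    columns = {}
    for segment in object_path.split("/"):
        if "=" in segment:
            key, value = segment.split("=", 1)
            columns[key] = value
    return columns

cols = parse_partition_columns(
    "YOUR_PREFIX/export_id=11111111-aaaa-bbbb-cccc-222222222222/"
    "tenant_id=33333333-dddd-eeee-ffff-444444444444/"
    "session_id=55555555-0000-1111-2222-666666666666/"
    "resource=runs/year=2024/month=1/day=15/part-0.parquet"
)
# cols["resource"] == "runs"; cols["year"] == "2024" (a string, as in the path)
```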

6. Create a BigQuery external table

Grant BigQuery access to GCS

BigQuery needs read access to your bucket. Find your BigQuery service account in GCP Console → BigQuery → Project Settings, then grant it access:
gcloud storage buckets add-iam-policy-binding gs://YOUR_BUCKET_NAME \
  --member="serviceAccount:YOUR_BQ_SERVICE_ACCOUNT@bigquery-encryption.iam.gserviceaccount.com" \
  --role="roles/storage.objectViewer"

Create the external table

Run the following in the BigQuery console or with bq:
CREATE OR REPLACE EXTERNAL TABLE `YOUR_PROJECT.YOUR_DATASET.langsmith_runs`
WITH PARTITION COLUMNS (
  export_id STRING,
  tenant_id STRING,
  session_id STRING,
  resource  STRING,
  year      INT64,
  month     INT64,
  day       INT64
)
OPTIONS (
  uris = ['gs://YOUR_BUCKET_NAME/YOUR_PREFIX/export_id=*'],
  format = 'PARQUET',
  hive_partition_uri_prefix = 'gs://YOUR_BUCKET_NAME/YOUR_PREFIX'
);
Why export_id=* instead of *? LangSmith writes a temporary tmp/ folder to your prefix during destination creation and credential rotation (see Temporary validation file). Using export_id=* in the URI scopes BigQuery to only the Hive-partitioned export directories, avoiding any stray files under tmp/. If you have confirmed that your prefix contains only export data (e.g. you manually deleted the tmp/ folder), you can use * instead.
BigQuery will detect partition column names and types automatically from the path structure. You can also use WITH PARTITION COLUMNS without explicit column definitions to let BigQuery infer them:
CREATE OR REPLACE EXTERNAL TABLE `YOUR_PROJECT.YOUR_DATASET.langsmith_runs`
WITH PARTITION COLUMNS
OPTIONS (
  uris = ['gs://YOUR_BUCKET_NAME/YOUR_PREFIX/export_id=*'],
  format = 'PARQUET',
  hive_partition_uri_prefix = 'gs://YOUR_BUCKET_NAME/YOUR_PREFIX'
);

7. Query your data

Once your external table is set up, you can query it directly in BigQuery. For the full list of available columns, see Exportable fields.
Always filter on year, month, and/or day in your WHERE clause to enable partition pruning. Without these filters, BigQuery scans all files in the prefix.
Daily LLM cost and token usage:
SELECT
  DATE(start_time) AS date,
  SUM(total_cost)   AS total_cost_usd,
  SUM(total_tokens) AS total_tokens,
  COUNT(*)          AS run_count
FROM `YOUR_PROJECT.YOUR_DATASET.langsmith_runs`
WHERE year = 2024
  AND run_type = 'llm'
GROUP BY 1
ORDER BY 1;
Error rate by run name:
SELECT
  name,
  COUNTIF(status = 'error')                          AS errors,
  COUNT(*)                                            AS total,
  ROUND(COUNTIF(status = 'error') / COUNT(*) * 100, 2) AS error_rate_pct
FROM `YOUR_PROJECT.YOUR_DATASET.langsmith_runs`
WHERE year = 2024
GROUP BY name
HAVING total > 100
ORDER BY error_rate_pct DESC;
p50 and p95 latency for LLM runs:
SELECT
  name,
  APPROX_QUANTILES(TIMESTAMP_DIFF(end_time, start_time, MILLISECOND), 100)[OFFSET(50)] AS p50_ms,
  APPROX_QUANTILES(TIMESTAMP_DIFF(end_time, start_time, MILLISECOND), 100)[OFFSET(95)] AS p95_ms
FROM `YOUR_PROJECT.YOUR_DATASET.langsmith_runs`
WHERE session_id = 'YOUR_PROJECT_ID'
  AND run_type = 'llm'
GROUP BY name;

Credential rotation

To rotate your HMAC keys without interrupting active exports:
  1. Generate new HMAC keys in GCP for the same service account.
  2. Call the PATCH endpoint with the new credentials:
    curl --request PATCH \
      --url 'https://api.smith.langchain.com/api/v1/bulk-exports/destinations/YOUR_DESTINATION_ID' \
      --header 'Content-Type: application/json' \
      --header 'X-API-Key: YOUR_API_KEY' \
      --header 'X-Tenant-Id: YOUR_WORKSPACE_ID' \
      --data '{
        "credentials": {
          "access_key_id": "NEW_HMAC_ACCESS_ID",
          "secret_access_key": "NEW_HMAC_SECRET"
        }
      }'
    
    LangSmith validates the new credentials with a test write before saving. A new tmp/ file may appear in your bucket during this validation (see Temporary validation file).
  3. Keep old HMAC keys active until all in-flight export runs complete. Both credential sets are valid simultaneously during the transition window.
  4. Delete the old HMAC keys in GCP once you have confirmed no in-flight runs are using them.
For full details, see Rotate destination credentials.

Troubleshooting

Each symptom below is listed with its likely cause and fix:
  • 400 Access denied on destination creation
    Likely cause: HMAC credentials lack write permission. Fix: verify the service account has storage.objects.create on the bucket.
  • 400 Key ID you provided does not exist
    Likely cause: HMAC access ID is invalid. Fix: regenerate HMAC keys in GCP.
  • 400 Invalid endpoint
    Likely cause: endpoint URL is malformed. Fix: use exactly https://storage.googleapis.com.
  • BigQuery table shows no rows
    Likely cause: export not yet complete. Fix: check export status with GET /api/v1/bulk-exports/{export_id}.
  • BigQuery partition pruning not working
    Likely cause: incorrect hive_partition_uri_prefix. Fix: ensure the prefix ends at the directory level before the first partition key, e.g. gs://BUCKET/PREFIX.
  • BigQuery picks up tmp/ files
    Likely cause: broad URI glob. Fix: use export_id=* in your uris value instead of *.
For additional error codes and export status details, see Monitor and troubleshoot bulk exports.
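When debugging empty tables, it can help to script the status check from the troubleshooting list. A minimal sketch that builds (but does not send) the GET request; only the endpoint path is taken from this guide, the rest is generic urllib usage:

```python
import urllib.request

BULK_EXPORTS_URL = "https://api.smith.langchain.com/api/v1/bulk-exports"

def build_status_request(export_id: str, api_key: str,
                         workspace_id: str) -> urllib.request.Request:
    """Build the GET /api/v1/bulk-exports/{export_id} status request."""
    return urllib.request.Request(
        f"{BULK_EXPORTS_URL}/{export_id}",
        headers={"X-API-Key": api_key, "X-Tenant-Id": workspace_id},
        method="GET",
    )

req = build_status_request("YOUR_EXPORT_ID", "YOUR_API_KEY", "YOUR_WORKSPACE_ID")
# Send with urllib.request.urlopen(req) and inspect the response JSON for status.
```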