I’m using the BigQuery Spark connector (version 0.33.0) to create a clustered table in BigQuery, specifically in direct write mode. While the process completes without errors, the resulting table does not have the expected clustering configuration.
Here’s a brief overview of my approach:
df.write.format("bigquery") \
    .option("parentProject", parent_project) \
    .option("credentials", credentials) \
    .option("project", project) \
    .option("dataset", dataset) \
    .option("table", table_name) \
    .option("clusteredFields", clustered_fields) \
    .option("writeMethod", "direct") \
    .mode(mode).save()
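For completeness, the option values are all plain strings; clusteredFields in particular is a comma-separated list of top-level column names. The values below are illustrative placeholders rather than my actual configuration:

parent_project = "my-billing-project"          # placeholder
project = "my-data-project"                    # placeholder
dataset = "analytics"                          # placeholder
table_name = "events_clustered"                # placeholder
clustered_fields = "customer_id,event_date"    # comma-separated clustering columns (placeholders)
credentials = "<base64-encoded service account key>"  # placeholder
mode = "overwrite"                             # placeholder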
I followed the guidelines in the official documentation, but the clustering isn’t being applied. I’m not sure what I’m missing, or whether a specific connector version is required for clustering to work with the direct write method.
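For what it’s worth, this is how I’m checking whether clustering was applied after the write finishes (using the google-cloud-bigquery client; the table reference is built from the same placeholder variables as above):

from google.cloud import bigquery

# Fetch the table metadata and inspect its clustering configuration.
client = bigquery.Client(project=project)
table = client.get_table(f"{project}.{dataset}.{table_name}")
print(table.clustering_fields)  # I expected the clustered columns here, but it prints None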
Has anyone experienced a similar issue, or does anyone have insight into this behavior?