Using AWS EMR 6.12.0, PySpark 3.4.0 and Python 3.8.
Full error message:

pyspark.errors.exceptions.captured.AnalysisException:
org.apache.hadoop.hive.metastore.api.InvalidObjectException:
Unknown operator '!=' (Service: AWSGlue; Status Code: 400)
This happens when running:

df.write.mode("append").format("hive").partitionBy(*list_of_cols).saveAsTable("<database>.<table>")
My guess is that one of the operations applied to df contains a '!=' comparison, such as:

df_b = df_z.groupBy(['key']).agg(F.count(F.when(F.col("col_a") != 1, F.col("col_b")).otherwise(F.lit(0)))
df = df_a.join(df_b, ['key'], 'left')
What are some ways I can replace '!='?
There is nothing wrong with !=, but you have a missing closing parenthesis here:

df_b = df_z.groupBy(['key']).agg(F.count(F.when(F.col("col_a") != 1, F.col("col_b")).otherwise(F.lit(0))))
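A quick way to convince yourself that the problem is syntactic rather than the != operator: feed both versions of the line to Python's own parser. This is just an illustrative sketch using the stdlib ast module; the strings below are copies of the aggregation line from the question.

```python
import ast

# The aggregation line as posted in the question: one ')' short.
broken = ('df_b = df_z.groupBy(["key"]).agg(F.count('
          'F.when(F.col("col_a") != 1, F.col("col_b")).otherwise(F.lit(0)))')
# The same line with the final closing parenthesis restored.
fixed = broken + ")"

def parses(src: str) -> bool:
    """Return True if src is syntactically valid Python."""
    try:
        ast.parse(src)
        return True
    except SyntaxError:
        return False

print(parses(broken))  # False: unclosed '(' is a SyntaxError
print(parses(fixed))   # True: the '!=' itself parses fine
```

So once the parenthesis is added, the '!=' comparison should go through unchanged; the expression itself is standard PySpark Column syntax.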