“inverse” value with Polars dataframe

Say, I have a dataset in the following DataFrame:

df=pl.DataFrame({
    'x':['a','a','b','b'],
    'y':['b','c','c','a'],
    'value':[3,5,1,4]
})
df
shape: (4, 3)
┌─────┬─────┬───────┐
│ x   ┆ y   ┆ value │
│ --- ┆ --- ┆ ---   │
│ str ┆ str ┆ i64   │
╞═════╪═════╪═══════╡
│ a   ┆ b   ┆ 3     │
│ a   ┆ c   ┆ 5     │
│ b   ┆ c   ┆ 1     │
│ b   ┆ a   ┆ 4     │
└─────┴─────┴───────┘

Now, I’d like to add a column to this dataframe that would contain the inverse
value. I define an inverse value as inverse(x, y) == value(y, x). E.g., from
the example above, inverse (a, b) == value(b, a) == 4.

If value(y, x) didn’t exist then inverse(x, y) would be given the default
value of 0.

In other words, I’d like to add an inverse column such as I’d end up with
something like this:

shape: (4, 4)
┌─────┬─────┬───────┬─────────┐
│ x   ┆ y   ┆ value ┆ inverse │
│ --- ┆ --- ┆ ---   ┆ ---     │
│ str ┆ str ┆ i64   ┆ i64     │
╞═════╪═════╪═══════╪═════════╡
│ a   ┆ b   ┆ 3     ┆ 4       │
│ a   ┆ c   ┆ 5     ┆ 0       │
│ b   ┆ c   ┆ 1     ┆ 0       │
│ b   ┆ a   ┆ 4     ┆ 3       │
└─────┴─────┴───────┴─────────┘

Is this doable in an easy and optimal way? Preferably with expressions?

Thanks a lot in advance.

You could join it to itself with aliases and then do fill_null(0).

df.join(
    df.select(
        y="x",
        x="y",
        inverse="value"
    ),
    on=["x","y"], how="left"
).fill_null(0)
shape: (4, 4)
┌─────┬─────┬───────┬─────────┐
│ x   ┆ y   ┆ value ┆ inverse │
│ --- ┆ --- ┆ ---   ┆ ---     │
│ str ┆ str ┆ i64   ┆ i64     │
╞═════╪═════╪═══════╪═════════╡
│ a   ┆ b   ┆ 3     ┆ 4       │
│ a   ┆ c   ┆ 5     ┆ 0       │
│ b   ┆ c   ┆ 1     ┆ 0       │
│ b   ┆ a   ┆ 4     ┆ 3       │
└─────┴─────┴───────┴─────────┘

Leave a Comment