I am trying to calculate PageRank for a buyer/seller network. A buyer can also be a seller: A could sell $100 worth of goods to B, and B could sell $20 worth of (other) goods to A. So I am using a DiGraph for the network, with the dollar value as the edge weight.
My problem is that the two scripts below, one without edge_attr and one with it, yield exactly the same PageRank values.
So am I missing anything here?
Thank you very much for your time.
import pandas as pd
import networkx as nx
df = pd.DataFrame({'Seller': ['A1', 'B1', 'A2', 'A2', 'B2', 'B2'],
                   'Buyer': ['B1', 'A1', 'B1', 'B2', 'A2', 'B2'],
                   'Value': [10, 20, 30, 40, 50, 5]})
g1 = nx.from_pandas_edgelist(df, 'Seller', "Buyer", create_using=nx.DiGraph())
pagerank_g1 = nx.pagerank(g1)
pagerank_g1=sorted(pagerank_g1.items(),key=lambda v:(v[1],v[0]),reverse=True)
print(' no weight, pagerank_g1', pagerank_g1)
g2=nx.from_pandas_edgelist(df, 'Seller', "Buyer", create_using=nx.DiGraph(), edge_attr="Value")
pagerank_g2 = nx.pagerank(g2)
pagerank_g2=sorted(pagerank_g2.items(),key=lambda v:(v[1],v[0]),reverse=True)
print('with weight, pagerank_g2', pagerank_g2)
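That's because the default name of the weight attribute in pagerank is "weight", while yours is "Value". Since no edge has a "weight" attribute, every edge counts as 1 and your values are ignored.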
pagerank(G, alpha=0.85, personalization=None, max_iter=100, tol=1e-06,
         nstart=None, weight="weight", dangling=None)
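A quick way to confirm this is to look at the keys stored on the edges and compare the two calls. The following is only a minimal sketch that reuses the data from the question; the variable names (edge_keys, default_rank, value_rank) are mine, not part of NetworkX.

```python
import networkx as nx
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({
    "Seller": ["A1", "B1", "A2", "A2", "B2", "B2"],
    "Buyer": ["B1", "A1", "B1", "B2", "A2", "B2"],
    "Value": [10, 20, 30, 40, 50, 5],
})
g = nx.from_pandas_edgelist(
    df, "Seller", "Buyer", edge_attr="Value", create_using=nx.DiGraph()
)

# Collect every attribute name that appears on an edge.
# The amounts live under "Value"; no edge has a "weight" key.
edge_keys = {key for _, _, data in g.edges(data=True) for key in data}
print(edge_keys)

# Default call: looks up "weight", finds nothing, treats each edge as 1.
default_rank = nx.pagerank(g)
# Explicit call: uses the dollar amounts stored under "Value".
value_rank = nx.pagerank(g, weight="Value")

for node in sorted(g):
    print(node, round(default_rank[node], 4), round(value_rank[node], 4))
```

If the two columns printed for each node differ, the weights are being picked up by the second call only.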
So either pass the attribute name explicitly:
pagerank_g2 = nx.pagerank(g2, weight="Value")
# instead of `pagerank_g2 = nx.pagerank(g2)`
Or rename the column, and with it the edge attribute:
df.rename(columns={"Value": "weight"}, inplace=True)
# ...
g2 = nx.from_pandas_edgelist(df, ..., edge_attr="weight")
Output:
#No weight #With weight
[ [
('B1', 0.3956298463938973), ('B1', 0.40247095482812456),
('A1', 0.3737837827630368), ('A1', 0.37960199890637514),
('B2', 0.13549920760890705), ('A2', 0.11614768683390858),
('A2', 0.09508716323415882) ('B2', 0.10177935943159168)
] ]
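The two fixes should give the same ranking, since both make pagerank read the same numbers. Below is a minimal sketch to check that, again with the question's data. It uses a non-inplace rename so the original frame is left untouched, and the names by_keyword and by_rename are only illustrative.

```python
import networkx as nx
import pandas as pd

df = pd.DataFrame({
    "Seller": ["A1", "B1", "A2", "A2", "B2", "B2"],
    "Buyer": ["B1", "A1", "B1", "B2", "A2", "B2"],
    "Value": [10, 20, 30, 40, 50, 5],
})

# Fix 1: keep the "Value" attribute and name it in the pagerank call.
g_value = nx.from_pandas_edgelist(
    df, "Seller", "Buyer", edge_attr="Value", create_using=nx.DiGraph()
)
by_keyword = nx.pagerank(g_value, weight="Value")

# Fix 2: rename the column so the edge attribute is called "weight",
# which is what pagerank looks for by default.
df_weight = df.rename(columns={"Value": "weight"})
g_weight = nx.from_pandas_edgelist(
    df_weight, "Seller", "Buyer", edge_attr="weight", create_using=nx.DiGraph()
)
by_rename = nx.pagerank(g_weight)

# Both approaches should agree on every node's score.
same = all(abs(by_keyword[n] - by_rename[n]) < 1e-9 for n in g_value)
print(same)
```

Which one to prefer is mostly a matter of taste: the keyword keeps the column name meaningful, while the rename also makes other NetworkX functions that default to "weight" pick up the values.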