approve package
Submodules
approve.models module
- class approve.models.APPr(K: int, alpha: float, add_self_loops: bool = True, cached: bool = True, normalize: bool = True, **kwargs)[source]
Bases:
MessagePassingThe approximate personalized PageRank model for homogeneous graphs, adapted from the APPNP model:
\[ \begin{align}\begin{aligned}\mathbf{X}^{(0)} &= \mathbf{X}\,,\\\mathbf{X}^{(k)} &= (1 - \alpha) \mathbf{\hat{S}} \mathbf{X}^{(k-1)} + \alpha \mathbf{X}^{(0)}\,,\\\mathbf{X}' &= \mathbf{X}^{(K)}\,,\end{aligned}\end{align} \]where \(\alpha\) is the transport probability, \(\mathbf{\hat{S}}\) is the (column) stochastic matrix \(\mathbf{\hat{S}} := \mathbf{\hat{A}}\mathbf{\hat{D}}^{-1}\), \(\mathbf{\hat{A}}\) is the adjacency matrix with self-loops possibly added, and \(\mathbf{\hat{D}}\) is the diagonal degree matrix with entries \(\mathbf{\hat{D}}_{ii} := \sum_\ell \mathbf{\hat{A}}_{\ell i}\).
Note
The adjacency matrix can include values different from
1(representing edge weights) via the optionaledge_weighttensor.- Parameters:
K (int) – The number of iterations \(K\).
alpha (float) – The teleport probability \(\alpha\).
add_self_loops (bool, optional) – If set to
True, the model will add self-loops to the graph. (default:True)cached (bool, optional) – If set to
True, the model will cache the computation of \(\mathbf{\hat{A}} \mathbf{\hat{D}}\) on the first execution and use the cached version for further executions. (default:True)normalize (bool, optional) – If set to
True, the model will add self-loops to the graph and normalize the edge weights. (default:True)**kwargs (optional) – Additional arguments of
torch_geometric.nn.conv.MessagePassing.
Examples
from approve.models import APPr from torch import tensor # define homogeneous graph edge_index = torch.tensor([[1, 2, 2], [0, 0, 1]]) # assign uniform scores to each node x = torch.full((3, 1), 1 / 3) # compute PageRank model = APPr(K=30, alpha=0.5) model(x, edge_index) >>> tensor([[0.5333], [0.2667], [0.2000]])
- forward(x: Tensor, edge_index: Tensor, edge_weight: Tensor | None = None) Tensor[source]
Runs the forward pass of the module.
- Parameters:
x (Tensor) – The node features \(\mathbf{X}\).
edge_index (Tensor) – The edge indices.
edge_weight (Tensor, optional) – The edge weights. (default:
None)
- Returns:
The updated node features \(\mathbf{X}'\).
- Shape:
Input: \((|\mathcal{N}|, F)\), \((2,|\mathcal{E}|)\), and \((|\mathcal{E}|)\), where \(|\mathcal{N}|\) is the number of nodes, \(|\mathcal{E}|\) is the number of edges, and \(F\) is the number of features.
Output: \((|\mathcal{N}|, F)\).
- static message(x_j, edge_weight)[source]
Constructs messages from node \(j\) to node \(i\) in analogy to \(\phi_{\mathbf{\Theta}}\) for each edge in
edge_index. This function can take any argument as input which was initially passed topropagate(). Furthermore, tensors passed topropagate()can be mapped to the respective nodes \(i\) and \(j\) by appending_ior_jto the variable name, .e.g.x_iandx_j.
- class approve.models.HeteroAPPr(K: int, add_self_loops: bool = True, add_special_edges: bool = True, cached: bool = True, normalize: bool = True, check_node_type_conn: bool = True, **kwargs)[source]
Bases:
MessagePassingThe approximate personalized PageRank model for heterogeneous graphs which generalizes the APPNP model:
\[ \begin{align}\begin{aligned}\mathbf{X}^{[n](0)} &= \mathbf{X}^{[n]}\,,\\\mathbf{X}^{[n](k+1)} &= (1 - \alpha[n]) \textstyle \sum_{n' \in \mathcal{N}} \sum_{e \in \mathcal{E}[n',n]} \beta[e] \mathbf{\hat{S}}[e] \mathbf{X}^{[n'](k)} + \alpha[n] \mathbf{X}^{[n](0)}\,,\\\mathbf{X}^{\prime[n]} &= \mathbf{X}^{[n](K)}\,,\end{aligned}\end{align} \]where \(\alpha[n]\) is the transport probability for type \(n\) nodes, \(\mathcal{N}\) is the set of all node types, \(\mathcal{E}[n',n]\) is the set of all edge types \(e: n' \to n\), \(\beta[e]\) is the contribution percentage from type \(e\) edges for updating the features of type \(n\) nodes, \(\mathbf{\hat{S}}[e]\) is the (column) stochastic matrix \(\mathbf{\hat{S}}[e] := \mathbf{\hat{A}}[e]\mathbf{\hat{D}}[e]^{-1}\), \(\mathbf{\hat{A}}[e]\) is the adjacency matrix for type \(e\) edges with self-loops (for homogenenous layers) or special edges (for bipartite layers) possibly added, and \(\mathbf{\hat{D}}[e]\) is the diagonal degree matrix for type \(e\) edges with entries \(\mathbf{\hat{D}}[e]_{ii} := \sum_\ell \mathbf{\hat{A}}[e]_{\ell i}\).
Note
All adjacency matrices can include values other than
1(representing edge weights) via the optionaledge_weight_dictdictionary.- Parameters:
K (int) – The number of iterations \(K\).
add_self_loops (bool, optional) – If set to
True, the model will add self-loops to all homogeneous layers. (default:True)add_special_edges (bool, optional) – If set to
True, the model will add special edges to all bipartite layers. (default:True)cached (bool, optional) – If set to
True, the model will cache the computation of \(\mathbf{\hat{S}}[e]\) on the first execution and use the cached version for further executions. (default:True)normalize (bool, optional) – If set to
True, the model will add self-loops to homogeneous layers and special edges to bipartite layers, and normalize the edge weights. (default:True)check_node_type_conn (bool, optional) – If set to
True, the model checks that every node type is the target of at least one edge type. (default:True)**kwargs (optional) – Additional arguments of
torch_geometric.nn.conv.MessagePassing
Examples
from approve.models import HeteroAPPr from torch import tensor from torch_geometric.data import HeteroData # define heterogeneous graph hetero_data = HeteroData() hetero_data['paper', 'cites', 'paper'].edge_index = \ torch.tensor([[1, 2, 2], [0, 0, 1]]) hetero_data['venue', 'publishes', 'paper'].edge_index = \ torch.tensor([[0, 1], [0, 1]]) hetero_data['paper', 'rev_publishes', 'venue'].edge_index = \ hetero_data['venue', 'publishes', 'paper'].edge_index[[1,0]] # assign uniform scores to each node hetero_data['paper'].x = torch.full((3, 1), 1 / 3) hetero_data['venue'].x = torch.full((2, 1), 1 / 2) # compute PageRank model = HeteroAPPr(K=30) model(hetero_data.x_dict, hetero_data.edge_index_dict) >>> {'paper': tensor([[0.4605], [0.3289], [0.2105]]), 'venue': tensor([[0.4803], [0.4145], [0.1053]])}
- forward(x_dict: Dict[str, Tensor], edge_index_dict: Dict[Tuple[str, str, str], Tensor], edge_weight_dict: Dict[Tuple[str, str, str], Tensor] | None = None, special_dict: Dict[str, int | None] | None = None, alpha_exp_dict: Dict[str, float] | None = None, beta_exp_dict: Dict[Tuple[str, str, str], float] | None = None) Dict[str, Tensor][source]
Runs the forward pass of the module.
- Parameters:
x_dict (Dict[str, Tensor]) – A dictionary of features for each node type.
edge_index_dict (Dict[Tuple[str, str, str], Tensor]) – A dictionary of edge indices for each edge type.
edge_weight_dict (Dict[Tuple[str, str, str], Tensor], optional) – A dictionary of edge weights for each edge type. (default:
None)special_dict (Dict[str, int] , optional) – A dictionary of special nodes for each node type. (default:
None)alpha_exp_dict (Dict[str, float], optional) – A dictionary of exponents used to calculate transport probabilities via
utils.gen_alpha_dictfor each node type. (default:None)beta_exp_dict (Dict[Tuple[str, str, str], float], optional) – A dictionary of exponents used to calculate contribution percentages via
utils.gen_beta_dictfor each edge type. (default:None)
- Returns:
An updated dictionary of features for each node type.
- Raises:
AssertionError – If
self.check_node_type_connis set toTrueand not every node type is the target of at least one edge type.
- static message(x_j, edge_weight)[source]
Constructs messages from node \(j\) to node \(i\) in analogy to \(\phi_{\mathbf{\Theta}}\) for each edge in
edge_index. This function can take any argument as input which was initially passed topropagate(). Furthermore, tensors passed topropagate()can be mapped to the respective nodes \(i\) and \(j\) by appending_ior_jto the variable name, .e.g.x_iandx_j.
approve.typing module
approve.utils module
- approve.utils.add_remaining_special_edges(edge_index: Tensor, edge_weight: Tensor | None = None, num_nodes_s: int | None = None, num_nodes_t: int | None = None, special_s: int | None = None, special_t: int | None = None, fill_value: float = 1.0) Tuple[Tensor, Tensor | None, int | None, int | None, int | None, int | None][source]
Adds special edges, from missing source nodes to
special_t, and fromspecial_sto missing target nodes, for all nodes missing fromedge_index. Ifspecial_s(resp.special_t) is not given, then it is set tonum_nodes_s(resp.num_nodes_t).- Parameters:
edge_index (Tensor) – The edge indices.
edge_weight (Tensor, optional) – The edge weights. (default:
None)num_nodes_s (int, optional) – The number of source nodes, if known. (default:
None)num_nodes_t (int, optional) – The number of target nodes, if known. (default:
None)special_s (int, optional) – The special source node, if set. (default:
None)special_t (int, optional) – The special target node, if set. (default:
None)fill_value (float, optional) – The weight associated with special edges. This is relevant only when
edge_weightis notNone. (default:1.)
- Returns:
The updated values for
edge_weight,edge_weight,num_nodes_s,num_nodes_t,special_s, andspecial_t.
- Shape:
Input: \((2,|\mathcal{E}|)\) and \((|\mathcal{E}|)\) where \(|\mathcal{E}|\) is the number of edges.
Output: \((2,|\mathcal{E}'|)\) and \((|\mathcal{E}'|)\) where \(|\mathcal{E}'|\) is the new number of edges.
Examples
>>> edge_index = torch.tensor([[0, 0], ... [0, 1]]) >>> add_remaining_special_edges(edge_index, num_nodes_s=2, ... num_nodes_t=3) (tensor([[0, 0, 1, 2], [0, 1, 3, 2]]), None, 3, 4, 2, 3)
- approve.utils.bipartite_maybe_num_nodes(edge_index: Tensor, num_nodes_s: int | None = None, num_nodes_t: int | None = None) Tuple[int, int][source]
Calculates the number of source and target nodes in the bipartite graph with edge indices given by
edge_index.- Parameters:
edge_index (Tensor) – The edge indices.
num_nodes_s (int, optional) – The number of source nodes, if known. (default:
None)num_nodes_t (int, optional) – The number of target nodes, if known. (default:
None)
- Returns:
The updated values for
num_nodes_sandnum_nodes_t.- Raises:
AssertionError – If
num_nodes_sis notNoneand not more than the maximum value ofedge_index[0].AssertionError – If
num_nodes_tis notNoneand not more than the maximum value ofedge_index[1].
- Shape:
Input: \((2,|\mathcal{E}|)\) where \(|\mathcal{E}|\) is the number of edges.
Examples
>>> edge_index = torch.tensor([[0, 0], ... [0, 1]]) >>> bipartite_maybe_num_nodes(edge_index) (1, 2)
- approve.utils.bipartite_pr_norm(edge_index: Tensor, edge_weight: Tensor | None = None, num_nodes_s: int | None = None, num_nodes_t: int | None = None, add_special_edges: bool = True, special_s: int | None = None, special_t: int | None = None, flow: str = 'source_to_target') Tuple[Tensor, Tensor | None, int, int, int | None, int | None][source]
Calculates the entries of the (column) stochastic matrix \(\mathbf{\hat{S}} := \mathbf{\hat{A}}\mathbf{\hat{D}}^{-1}\) as edge weights where \(\mathbf{\hat{A}}\) is the adjacency matrix with special edges possibly added, and \(\mathbf{\hat{D}}\) is the diagonal degree matrix with entries \(\mathbf{\hat{D}}_{ii} := \sum_\ell \mathbf{\hat{A}}_{\ell i}\).
- Parameters:
edge_index (Tensor) – The edge indices.
edge_weight (Tensor, optional) – The edge weights. (default:
None)num_nodes_s (int, optional) – The number of source nodes, if known. (default:
None)num_nodes_t (int, optional) – The number of target nodes, if known. (default:
None)add_special_edges (bool, optional) – If set to
True, the function will add special edges to the bipartite graph. (default:True)special_s (int, optional) – The special source node, if set. (default:
None)special_t (int, optional) – The special target node, if set. (default:
None)flow (str, optional) – The flow of the graph. (default:
"source_to_target")
- Returns:
The updated values for
edge_index,edge_weight,num_nodes_s,num_nodes_t,special_s, andspecial_t.- Raises:
AssertionError – If
flowis neither"source_to_target"nor"target_to_source".
- approve.utils.gen_alpha_dict(node_type_list: List[str], alpha_exp_dict: Dict[str, float] | None = None) Dict[str, float][source]
Calculates the transport probabilities \(\alpha[n]\) for each node type \(n\) in
node_type_listvia\[\alpha[n] = \frac{\exp{A[n]}}{1 + \exp{A[n]}}\,.\]Note
If the exponent \(A[n]\) is not specified in
alpha_exp_dict, then it defaults to0.- Parameters:
node_type_list (List[str]) – A list of all node types.
alpha_exp_dict (Dict[str, float], optional) – A dictionary of exponents \(A[n]\) used to calculate the transport probabilities \(\alpha[n]\) for each node type.
- Returns:
A dictionary of transport probabilities \(\alpha[n]\) for each node type.
- approve.utils.gen_beta_dict(edge_type_list: List[Tuple[str, str, str]], beta_exp_dict: Dict[Tuple[str, str, str], float] | None = None) Dict[Tuple[str, str, str], float][source]
Calculates the contribution percentages \(\beta[e]\) for each edge type \(e:n' \to n\) in
beta_exp_listvia\[\beta[e] = \frac{\exp{B[e]}}{\sum_{n'' \in \mathcal{N}} \sum_{e'\in\mathcal{E}[n'',n]} \exp{B[e']}}\,,\]where the sum is over all edge types with type \(n\) nodes as targets.
Note
If the exponent \(B[e]\) is not specified in
beta_exp_dict, then it defaults to0.- Parameters:
edge_type_list (List[Tuple[str, str, str]]) – A list of all edge types.
beta_exp_dict (Dict[Tuple[str, str, str], float], optional) – A dictionary of exponents \(B[e]\) used to calculate the contribution percentages \(\beta[e]\) for each edge type.
- Returns:
A dictionary of contribution percentages \(\beta[e]\) for each edge type.
- approve.utils.hetero_pr_norm(edge_index_dict: Dict[Tuple[str, str, str], Tensor], edge_weight_dict: Dict[Tuple[str, str, str], Tensor] | None = None, num_nodes_dict: Dict[str, int] | None = None, add_self_loops: bool = True, add_special_edges: bool = True, special_dict: Dict[str, int | None] | None = None, flow: str = 'source_to_target') Tuple[Dict[Tuple[str, str, str], Tensor], Dict[Tuple[str, str, str], Tensor], Dict[str, int], Dict[str, int | None]][source]
Calculates the entries of the (column) stochastic matrix \(\mathbf{\hat{S}}[e] := \mathbf{\hat{A}}[e]\mathbf{\hat{D}}[e]^{-1}\) as edge weights for all edge types \(e\) where \(\mathbf{\hat{A}}[e]\) is the adjacency matrix for type \(e\) edges with self-loops (for homogenenous layers) or special edges (for bipartite layers) possibly added, and \(\mathbf{\hat{D}}[e]\) is the diagonal degree matrix for type \(e\) edges with entries \(\mathbf{\hat{D}}[e]_{ii} := \sum_\ell \mathbf{\hat{A}}[e]_{\ell i}\).
- Parameters:
edge_index_dict (Dict[Tuple[str, str, str], Tensor]) – A dictionary of edge indices for each edge type.
edge_weight_dict (Dict[Tuple[str, str, str], Tensor], optional) – A dictionary of edge weights for each edge type. (default:
None)num_nodes_dict (Dict[str, Tensor], optional) – A dictionary for the number of each node type, if known. (default:
None)add_self_loops (bool, optional) – If set to
True, the function will add self-loops to all homogeneous layers. (default:True)add_special_edges (bool, optional) – If set to
True, the function will add special edges to all bipartite layers. (default:True)special_dict (Dict[str], Tensor], optional) – A dictionary of special nodes for each node type, if set. (default:
None)flow (str, optional) – The flow of the graph. (default:
"source_to_target")
- Returns:
The updated values for
edge_index_dict,edge_weight_dict,num_nodes_dict, andspecial_dict.- Raises:
AssertionError – If
flowis neither"source_to_target"nor"target_to_source".
- approve.utils.missing_indices(index: Tensor, num_nodes: int | None = None) Tensor | None[source]
Finds all integers in the interval \([0,N)\) missing from
index. Ifnum_nodesis notNone, \(N\) isnum_nodes. Otherwsie, \(N\) is the maximum value ofindexplus one.- Parameters:
index (torch.Tensor) – A one-dimensional tensor of non-negative integers representing nodes.
num_nodes (int, optional) – The number of nodes, if known. (default:
None)
- Returns:
A one-dimensional tensor of all missing indices or
Noneif no indices are missing.- Raises:
AssertionError – If
num_nodesis notNoneand not more than the maximum value ofindex.
Examples
>>> index = torch.tensor([0, 2]) >>> missing_index(index) tensor([1])
- approve.utils.pr_norm(edge_index: Tensor, edge_weight: Tensor | None = None, num_nodes: int | None = None, add_self_loops: bool = True, flow: str = 'source_to_target') Tuple[Tensor, Tensor][source]
Calculates the entries of the (column) stochastic matrix \(\mathbf{\hat{S}} := \mathbf{\hat{A}}\mathbf{\hat{D}}^{-1}\) as edge weights where \(\mathbf{\hat{A}}\) is the adjacency matrix with self-loops possibly added, and \(\mathbf{\hat{D}}\) is the diagonal degree matrix with entries \(\mathbf{\hat{D}}_{ii} := \sum_\ell \mathbf{\hat{A}}_{\ell i}\).
- Parameters:
edge_index (Tensor) – The edge indices.
edge_weight (Tensor, optional) – The edge weights. (default:
None)num_nodes (int, optional) – The number of nodes, if known. (default:
None)add_self_loops (bool, optional) – If set to
True, the function will add self-loops to the graph. (default:True)flow (str, optional) – The flow of the graph. (default:
"source_to_target")
- Returns:
The updated values for
edge_indexandedge_weight.- Raises:
AssertionError – If
flowis neither"source_to_target"nor"target_to_source".
- Shape:
Input: \((2,|\mathcal{E}|)\) and \((|\mathcal{E}|)\) where \(|\mathcal{E}|\) is the number of edges.
Output: \((2,|\mathcal{E}'|)\) and \((|\mathcal{E}'|)\) where \(|\mathcal{E}'|\) is the new number of edges.