Why
Graphs are everywhere
Graph Intelligence helps
It's the right time now!
Gartner predicts that graph technologies will be used in 80% of data and analytics innovations by 2025, up from 10% in 2021. We are currently crossing the chasm from early adopters to the early mainstream: neither too early nor too late, the timing is just right.
What
How to model a graph
A graph G is an ordered pair G = (V, E) comprising:
- V, a set of vertices (or nodes)
- E ⊆ {(x, y) | x, y ∈ V}, a set of edges (or links), which are pairs of nodes
Example:
directed graph
Different Types of Graphs
- Are edges directed? Directed graph vs. undirected graph.
- Are there multiple types of nodes or multiple types of edges? Homogeneous graph vs. heterogeneous graph.
How to represent a graph
Different representations lead to different computation patterns.
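As a rough illustration of how the representation shapes the computation, here is a minimal pure-Python sketch of one small directed graph stored three common ways: adjacency list, COO edge list, and CSR (the toy graph and all names are mine, not from any particular library).

```python
# A minimal sketch of three common graph representations.
# The tiny 4-node directed graph and all names here are illustrative only.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]   # COO: a plain list of (src, dst) pairs
num_nodes = 4

# Adjacency list: node -> list of out-neighbors (natural for per-node traversal)
adj_list = {u: [] for u in range(num_nodes)}
for u, v in edges:
    adj_list[u].append(v)

# CSR: row pointers + packed column indices (natural for row-wise aggregation)
indptr = [0] * (num_nodes + 1)
for u, _ in edges:
    indptr[u + 1] += 1
for i in range(num_nodes):
    indptr[i + 1] += indptr[i]
indices = [v for _, v in sorted(edges)]    # assumes edges grouped/sorted by source

print(adj_list)         # {0: [1, 2], 1: [2], 2: [3], 3: []}
print(indptr, indices)  # [0, 2, 3, 4, 4] [1, 2, 2, 3]
```

The adjacency list favors vertex-centric traversal, COO favors edge-parallel scatter operations, and CSR favors row-wise (per-destination) gathers, which is exactly why the choice of representation steers the computation pattern.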
How to compute on a graph
As shown in the figure below, computation on a graph proceeds as follows:
- Traverse all nodes in the graph, or sample a subset of them. Each time, pick one of these nodes as the target node;
- An L-layer GNN needs to aggregate information from at least the target node's L-hop neighborhood. We therefore construct an L-hop ego network around the target node. The figure shows a 2-hop ego network, where the green nodes are the first hop and the blue nodes are the second hop (a minimal sketch follows below);
- Compute and update the embedding of every node in the ego network. The embeddings use the graph's structural information as well as node and edge features.
ego network
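To make the ego-network construction concrete, here is a minimal pure-Python BFS sketch that collects the L-hop neighborhood of a target node from an adjacency list (the toy graph and function names are illustrative only).

```python
# Minimal sketch: extract the L-hop ego network of a target node.
# `adj` maps each node to its neighbors; all names are illustrative.
def ego_network(adj, target, num_hops):
    frontier = {target}
    visited = {target}
    hops = {0: {target}}
    for hop in range(1, num_hops + 1):
        nxt = set()
        for u in frontier:
            for v in adj.get(u, []):
                if v not in visited:
                    visited.add(v)
                    nxt.add(v)
        hops[hop] = nxt
        frontier = nxt
    return visited, hops   # all ego-network nodes, plus per-hop grouping

adj = {0: [1, 2], 1: [3], 2: [3, 4], 3: [], 4: []}
nodes, hops = ego_network(adj, target=0, num_hops=2)
print(hops)  # {0: {0}, 1: {1, 2}, 2: {3, 4}}
```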
So how are these embeddings computed and updated? Mainly via message passing. Message passing has several computational paradigms, such as GAS (Gather-ApplyEdge-Scatter) and SAGA (Scatter-ApplyEdge-Gather-ApplyVertex). Here we introduce SAGA, which is the more complete formulation. Suppose we need to compute and update the node in the figure below:
- Scatter: propagate the message from the source vertex onto the edge.
- ApplyEdge: transform the message along each edge.
- Gather: gather the transformed messages at the destination vertex.
- ApplyVertex: transform the gathered output to get the updated vertex.
The formulas are as follows:
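As a sketch, the four SAGA steps can be written in standard message-passing notation (the symbols $\phi$, $\oplus$, $\psi$ are my own choice, not tied to a specific framework):

$$
\begin{aligned}
\text{Scatter:} \quad & m_{(u,v)} = h_u \\
\text{ApplyEdge:} \quad & m'_{(u,v)} = \phi\bigl(m_{(u,v)},\, e_{(u,v)}\bigr) \\
\text{Gather:} \quad & a_v = \bigoplus_{u \in \mathcal{N}(v)} m'_{(u,v)} \\
\text{ApplyVertex:} \quad & h'_v = \psi\bigl(h_v,\, a_v\bigr)
\end{aligned}
$$

Here $h_u$ is the embedding of vertex $u$, $e_{(u,v)}$ the feature of edge $(u,v)$, $\oplus$ a permutation-invariant reduction such as sum, mean, or max, and $\phi$, $\psi$ are learnable NN transforms.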
Looking at this more closely, in the SAGA pattern ApplyEdge and ApplyVertex are standard neural network (NN) operations from traditional deep learning that we can reuse, while Scatter and Gather are new operations introduced by GNNs. In other words, Graph Computing = Graph Ops + NN Ops.
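To make the Graph Ops + NN Ops split concrete, here is a minimal NumPy sketch of one SAGA layer, assuming a toy dense edge list, sum as the Gather reduction, and single-layer perceptrons for ApplyEdge and ApplyVertex (all weights and names are illustrative, not any framework's API).

```python
import numpy as np

def saga_layer(h, edges, W_e, W_v):
    """One SAGA message-passing layer on a toy graph.
    h: (N, d) node embeddings; edges: list of (src, dst);
    W_e, W_v: illustrative weights for ApplyEdge / ApplyVertex."""
    N, d = h.shape
    agg = np.zeros((N, d))
    for u, v in edges:
        msg = h[u]                      # Scatter: source embedding onto the edge
        msg = np.tanh(msg @ W_e)        # ApplyEdge: NN transform of the message
        agg[v] += msg                   # Gather: sum-reduce at the destination
    return np.tanh(np.concatenate([h, agg], axis=1) @ W_v)  # ApplyVertex

rng = np.random.default_rng(0)
h = rng.normal(size=(4, 8))
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
W_e = rng.normal(size=(8, 8))
W_v = rng.normal(size=(16, 8))
print(saga_layer(h, edges, W_e, W_v).shape)  # (4, 8)
```

The per-edge loop (Scatter and Gather) is the graph-op half that real systems replace with sparse formats and fused kernels; the two matrix multiplies with activations are the reusable NN-op half.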
Different graph dataset scales
- One big graph
- Many small graphs
Different graph tasks
- Node-level prediction: predict the class or properties of a node in the graph.
- Edge-level prediction: predict whether an edge exists between two given nodes, and the class or properties of that edge.
- Graph-level prediction: predict the class or properties of the whole graph or a subgraph.
How
Workflow
Take fraud detection as an example:
The Tabformer dataset:
Tabformer data
fraud detection workflow
Software stack
- Compute plane
- Data plane
SW Challenges
Graph Sampler
For many-small-graphs datasets, full-batch training works most of the time. Full-batch training means we can train on the whole graph at once. When it comes to one-big-graph datasets, in many real-world scenarios we run into the neighbor explosion problem.
Neighbor Explosion:
neighbor explosion
The graph sampler comes to the rescue: sample only a fraction of the target nodes and, for each target node, sample a sub-graph of its ego network for training. This is called mini-batch training.
Graph sampling is triggered on every data-loading step, and the number of hops in the sampled graph equals the number of GNN layers L. This makes the graph sampler in the data loader critical to GNN training.
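As a rough sketch of what such a sampler does (pure Python; the fanout list and function names are illustrative, not a specific framework's API):

```python
import random

def sample_subgraph(adj, targets, fanouts):
    """Sample an L-hop sub-graph around `targets`, keeping at most
    fanouts[l] neighbors per node at hop l (mini-batch sampling)."""
    nodes = set(targets)
    sampled_edges = []
    frontier = list(targets)
    for fanout in fanouts:                        # one entry per GNN layer / hop
        nxt = []
        for u in frontier:
            neigh = adj.get(u, [])
            picked = random.sample(neigh, min(fanout, len(neigh)))
            for v in picked:
                sampled_edges.append((v, u))      # message flows v -> u
                if v not in nodes:
                    nodes.add(v)
                    nxt.append(v)
        frontier = nxt
    return nodes, sampled_edges

adj = {0: [1, 2, 3], 1: [4, 5], 2: [5, 6], 3: [6], 4: [], 5: [], 6: []}
print(sample_subgraph(adj, targets=[0], fanouts=[2, 2]))
```

Because this routine runs inside the data loader on every batch, its throughput directly bounds end-to-end training speed, which is what the challenge below is about.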
Challenge: how to optimize the sampler, both standalone and inside the training pipeline?
When the graph becomes huge (billions of nodes, tens of billions of edges), we face new at-scale challenges:
- How to store the huge graph across nodes? -> graph partitioning
- How to build a training system with not only distributed model computation but also a distributed graph store and distributed sampling?
- How to cut the graph while minimizing cross-partition connections? (a small edge-cut sketch follows after this list)
graph partition
- A possible GNN distributed training architecture:
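As a toy illustration of the graph-partition objective mentioned above, the sketch below simply counts cross-partition edges (the edge cut) for a given node-to-partition assignment. Production systems use dedicated partitioners (e.g. METIS-style algorithms); this only shows what is being minimized.

```python
def edge_cut(edges, part):
    """Count edges whose endpoints land in different partitions."""
    return sum(1 for u, v in edges if part[u] != part[v])

edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
print(edge_cut(edges, part={0: 0, 1: 0, 2: 1, 3: 1}))  # 3 cross-partition edges
```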
Scatter-Gather
- Fuse adjacent graph ops
One common fuse pattern for GCN & GraphSAGE:
Aggregate
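For GCN/GraphSAGE-style layers, where ApplyEdge is at most a scaling by an edge weight and Gather is a sum, the Scatter-ApplyEdge-Gather chain collapses into a single sparse-dense matrix multiply (SpMM). A minimal SciPy sketch under that assumption (toy graph, illustrative names):

```python
import numpy as np
from scipy.sparse import coo_matrix

# Toy graph in COO form; the fused Aggregate is just A @ H (an SpMM),
# instead of materializing per-edge messages (Scatter/ApplyEdge/Gather).
src = np.array([0, 0, 1, 2])
dst = np.array([1, 2, 2, 3])
w = np.ones_like(src, dtype=np.float32)                  # edge weights (e.g. GCN normalization)
A = coo_matrix((w, (dst, src)), shape=(4, 4)).tocsr()    # row = destination, col = source
H = np.random.rand(4, 8).astype(np.float32)              # node features

H_agg = A @ H        # fused Aggregate: weighted sum of neighbor features per destination
print(H_agg.shape)   # (4, 8)
```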
Challenge:
How to fuse more GNN patterns, with different ApplyEdge and ApplyVertex functions, automatically?
- How to implement the fused Aggregate?
Aggregate implementations
Challenge:
- Different graph data structures lead to different implementations of the same logical operation;
- Different graph characteristics favor different data structures (e.g. low-degree graphs favor COO, high-degree graphs favor CSR); a small comparison sketch follows below;
- **How to find the applicable zone for each, and hide this complexity from data scientists?**
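To make the data-structure point concrete, here is the same sum-aggregation written two ways on a toy graph: a COO-style scatter-add and a CSR-style per-destination gather (NumPy; all names are illustrative). Which one wins depends on the degree distribution and the hardware, as noted above.

```python
import numpy as np

H = np.random.rand(4, 8).astype(np.float32)                 # node features
src = np.array([0, 0, 1, 2]); dst = np.array([1, 2, 2, 3])  # COO edge list

# COO: scatter-add per edge (parallel over edges, but write conflicts on dst)
out_coo = np.zeros_like(H)
np.add.at(out_coo, dst, H[src])

# CSR: iterate destinations, gather their in-neighbors contiguously (no write conflicts)
indptr = np.array([0, 0, 1, 3, 4])    # in-edges grouped by destination node
indices = np.array([0, 0, 1, 2])      # source node of each in-edge
out_csr = np.zeros_like(H)
for v in range(4):
    out_csr[v] = H[indices[indptr[v]:indptr[v + 1]]].sum(axis=0)

print(np.allclose(out_coo, out_csr))  # True: same logical op, different implementations
```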
More
- Inference challenges
  - GNN inference needs full-batch inference; how to make it efficient?
  - Distributed inference for big graphs?
  - Vector quantization for node and edge features?
  - GNN distilled into an MLP?
- SW-HW co-design challenges
  - How to relieve irregular memory access in scatter-gather?
  - Do we need a dataflow engine for acceleration?
- …
Finishing words
"There is plenty of room at the top" 對技術(shù)人員很重要流纹。但為避免入寶山而空返漱凝,我們更需要建立起技術(shù)架構(gòu)茸炒,這就像是地圖一樣阵苇,只有按圖索驥才能更好地探索和利用好top里的plenty of room绅项。
References
- Graph + AI: What’s Next? Progress in Democratizing Graph for All
- Recent Advances in Efficient and Scalable Graph Neural Networks
- Crossing the Chasm – Technology adoption lifecycle
- Understanding and Bridging the Gaps in Current GNN Performance Optimizations
- Automatic Generation of High-Performance Inference Kernels for Graph Neural Networks on Multi-Core Systems
- Understanding GNN Computational Graph: A Coordinated Computation, IO, And Memory Perspective
- Graphiler: A Compiler For Graph Neural Networks
- Scatter-Add in Data Parallel Architectures
- fuseGNN: Accelerating Graph Convolutional Neural Network Training on GPGPU
- VQ-GNN: A Universal Framework to Scale up Graph Neural Networks using Vector Quantization
- NeuGraph: Parallel Deep Neural Network Computation on Large Graphs
- Completing a member knowledge graph with Graph Neural Networks
- PinnerFormer: Sequence Modeling for User Representation at Pinterest
- Gartner and Graph Analytics