Please report any encountered problems here.
The presentation contains animations that require any of the following PDF viewers:
Acrobat Reader (version >= 7), PDFXChange or Foxit Reader
Table of contents
Supplementary material are organized in the following way. Abstract
 Two major contributions
 Interface (conditions & hyperparameters)
 SGtSNE for Embedding of Sparse Stochastic Graphs/Matrices
 Performance
 Supplementary: relating the adjacency matrix to the graph embedding matrix in SGtSNE
 Precursor works
 Data sources
 Acknowledgments
Abstract
We introduce a nonlinear method for directly embedding large, sparse, stochastic graphs into lowdimensional spaces, without requiring vertex features to reside in or be transformed into, a metric space. Graph data and models are prevalent in realworld applications. Direct graph embedding is fundamental to many graph analysis tasks, in addition to graph visualization. We name the novel approach SGtSNE, as it is inspired by and builds upon the core principle of, a widely used method for nonlinear dimensionality reduction and data visualization. We also introduce tSNEΠ, a highperformance software for 2D, 3D embedding of large sparse graphs on personal computers with superior efficiency. It empowers with modern computing techniques for exploiting in tandem both matrix structures and memory architectures. We present elucidating embedding results on two synthetic graphs and six realworld networks (one synthetic and four realworld networks are shown in the article).
Two major contributions
Method SGtSNE: Enabling sparse stochastic graph (SG) embedding
We introduce a novel nonlinear approach for directly embedding large, sparse, stochastic graphs into lowdimensional spaces, without requiring vertex features to reside in, or be transformed into, a metric space. There are a lot of stochastic graphs, in realworld applications, without feature vectors readily in a metric space or available for public use. Our approach, named SGtSNE, is inspired by and builds upon the core principle of tSNE, which is widely used for nonlinear dimensionality reduction and data visualization. There are many variants of tSNE. In SGtSNE, by a radical removal of the existing restrictions, we extend the use of tSNE to the entire regime of stochastic graphs, no longer limited to distancebased \(k\)nearest neighbor (\(k\)NN) graphs. SGtSNE is also equipped with a parameterized nonlinear rescaling mechanism. It explores the sparsity to a much greater scope. In the case of a \(k\)NN graph, SGtSNE is capable of rendering much better embedding results than conventional tSNE.
Software tSNEΠ: Empowering SGtSNE with highperformance algorithms and implementations
We introduce a software tSNEΠ. It accelerates 2D graph embedding up to \(5\times\) in execution time in comparison to the best among the existing tSNE variants. More importantly, it enables 3D embedding of large sparse graphs on personal computers, with superior efficiency. We demonstrate that 3D embedding has greater capacity of preserving local and structural connectivity. Steerable visualization of all 3D embedding results is available. We present also experimental results on two biological networks, a social network created by Google, and an Amazon product network. The embedding results are characteristic of each and elucidating. They also manifest multiple advantages of the novel method SGtSNE with highperformance support by tSNEΠ.
Interface (conditions & hyperparameters)
Precursor algorithms
tSNE
tSNE web page by van der Maaten
The input to the conventional tSNE algorithm is a highdimensional dataset \(\mathbf{X} \in R^{N \times L}\), a distance function, the dimensionality \(d\) of the embedding space, and a user specified perplexity \(u\). All vertices are equalized by the conditional entropies, by determining the Gaussian parameter \(\sigma_i\):
\[  \sum_{j} a_{ij} p_{ji}(\sigma_i) \log( p_{ji}(\sigma_i) ) = \log(u) \]
The implementation, tSNEBH, provided by van der Maaten is implemented in C, with wrappers available in multiple programming languages (MATLAB, Python, Julia, etc.). The implementation accepts any dimension \(d >= 1\), but the complexity and execution grows exponentially with respect to the dimensions of the embedding space.
FItSNE
FItSNE accelerates the tSNE computations by approximating the repulsive forces through nonuniform fast Fourier transforms. The input parameters are similar to tSNEBH with the extra condition that \(1 \leq d \leq 2\); higherdimensional embedding are not supported.
FItSNE emerged last year and is available 1D/2D embedding only. The execution time for 2D embedding with FItSNE is now dominated by the computation of the attractive term. FItSNE suffers further from exponential growth in memory demand, with respect to the dimensionality of the embedding space, due to zero padding for explicit conversion to periodic convolution.
The algorithm is implemented in C with wrappers available in multiple programming languages (similarly to tSNEBH).
New methods & software
SGtSNE
The input parameters for SGtSNE are the stochastic matrix \(\mathbf{P}_c\), the rescaling parameter \(\lambda\), and the dimensions of the embedding space \(1 \leq d \leq 3\).
Our algorithm SGtSNE follows the essential principle of tSNE and removes the barriers in its conventional version. It admits any sparse stochastic graph \({\cal G} = (V, E, \mathbf{P}_c)\), including stochastic 𝑘NN graphs. We generate from \(\mathbf{P}_c\) a parameterized family of stochastic matrices, \(\mathbf{P}_c(\lambda)\), via nonlinear rescaling.
\[ \sum_{j} a_{ij} \, \phi\left( p_{ji}^{\gamma_i} \right) = \lambda \]
Unlike the perplexity \(u\) in the conventional version of tSNE, the parameter \(\lambda\) for SGtSNE exploits the sparsity without imposing any constraint. For the experiments shown in the supplementary material, \(\phi(x) = x\).
tSNEΠ
With tSNEΠ, we accelerate the entire gradient calculation of tSNE and enable Spaceland (3D) embedding in shorter time than 2D embedding with FItSNE or tSNEBH. We overcome challenges at multiple algorithmic stages with innovative approaches to utilizing the matrix structure and the memory architecture in tandem. We not only expedite further the repulsive term computation with dense matrix QQ, but also accelerate the attractive term computation with sparse matrix PQ. We employ swift translocation of {y𝑖} in memory for the Q and vector elements used by both terms.
For sparse computations on modern computers with multilevel memory architectures, memory access latency dominates the execution time. We use data permutations, matrix permutations, and permutation factorization through and through, in order to conform to and utilize the memory hierarchy. Permutations are unaccounted for in arithmetic complexity; they play, however, the key role of increasing data locality, substantially, in every algorithmic building block. We use Π in tSNEΠ to signify the role of permutations as well as the idea and techniques behind.
The algorithm is implemented in C with wrappers available in multiple programming languages (similarly to tSNEBH and FItSNE).
SGtSNE for Embedding of Sparse Stochastic Graphs/Matrices
NonkNN graphs (four realworld networks)
SGtSNE admits sparse stochastic graphs with arbitrary degree distribution, which are not admissable by conventional tSNE. Assume without loss of generality that there are no isolated single nodes. Rendering embeddings at different \(\lambda\) values. Showing better structure in 3D embeddings by tSNEΠ.
Four realworld networks are shown:
Amazon
Embedding of an Amazon
product network, with \(n = 334{,}863\) products and \(m = 1{,}851{,}744\) edges, each edge
connects two products frequently purchased together. The graph is
not admissible to the conventional tSNE. The embedding is by
SGtSNE, with \(l=8{,}000\)
iterations, starting with initial coordinates drawn from a random
Gaussian distribution, including \(l_e =
2{,}000\) early exaggeration iterations with \(\alpha = 12\).
The graph is comprised of multiple smallsized, strongly connected communities. The interconnectivities are weak. The resulting embeddings depict the cluster structure of the product network, both in 2D and 3D.
The degree (log scale) histogram of the 
SGtSNE  2D embeddings  

Variation with the rescaling parameter \(\lambda\)  
\(\lambda=1.0\)  \(\lambda=10.0\) 
\(\lambda=20.0\)  \(\lambda=80.0\) 
SGtSNE  3D embeddings  

Variation with the rescaling parameter \(\lambda\)  
\(\lambda=1.0\)  \(\lambda=10.0\) 
\(\lambda=20.0\)  \(\lambda=80.0\) 
SGtSNE  1D embedding  

Binary  4color scheme  Logarithmic 
The weighted adjacency matrix of the Amazon graph. The vertices are ordered by the 1D embedding of SGtSNEΠ with \(\lambda = 10\). Each pixel corresponds to a \(180\times 180\) matrix block. Three different color schemes are shown. Left: Blocks containing at least one nonzero element are shown in black; MiddleRight: Each pixel is colored based on the number of nonzero elements inside the corresponding matrix block. Two different color schemes are used. The blocks on the offdiagonal areas are extremely sparse, containing zero to three nonzero elements. 
SDDP^{4} clustering result 

The weighted adjacency matrix of the Amazon graph, after SDDP^{4} clustering on the 1D SGtSNE embedding. The vertices are ordered by the clusters identified by SDDP with \(k = 60\). Each pixel corresponds to a \(180\times 180\) matrix block. The number of edges outside of a \(5{,}400\) band (\(1.6\%\) of total vertices, shown in black) is \(41{,}507\) (\(4.5\%\) of total edges). The SDDP algorithm identified \(2{,}978\) communities. The histogram of the community sizes is shown below the the adjacency matrix. 
DBLP
Embedding of author collaboration network, with \(n = 317{,}080\) authors and \(m = 2{,}416{,}812\) edges, each edge connects two authors that have collaborated on at least one article. The number of leaf nodes is \(43{,}181\). The graph is not admissible to the conventional tSNE. The embedding is by SGtSNE, with \(l=8{,}000\) iterations, starting with initial coordinates drawn from a random Gaussian distribution, including \(l_e = 2{,}000\) early exaggeration iterations with \(\alpha = 12\).
The graph is comprised of multiple smallsized, strongly connected communities. The intercommunity edges are weak. The resulting embeddings depict the clustered structure of the product network, both in 2D and 3D.
The degree (log scale) histogram of the 
SGtSNE  2D embeddings  

Variation with the rescaling parameter \(\lambda\)  
\(\lambda=1.0\)  \(\lambda=10.0\) 
\(\lambda=20.0\)  \(\lambda=80.0\) 
SGtSNE  3D embeddings  

Variation with the rescaling parameter \(\lambda\)  
\(\lambda=1.0\)  \(\lambda=10.0\) 
\(\lambda=20.0\)  \(\lambda=80.0\) 
orkut
Spaceland (3D) and Flatland (2D) embedding of the friend social
network orkut
as of 2012. The network has \(n = 3{,}072{,}441\) user nodes, \(m = 237{,}442{,}607\) friendship
links. The graph is not admissible to the conventional tSNE. The
embedding is by SGtSNE using \(8{,}000\) iterations, with initial
coordinates drawn from a Gaussian random distribution.
The degree (log scale) histogram of the orkut social
network. The degree varies from \(1\) at the minimal (\(67{,}767\) leaf nodes) to \(33{,}313\) at the maximal. The
distribution resembles a lognormal distribution.

The 2D embedding takes only 50 minutes on a server with an Intel Xeon E52640v4 CPU and 256 GB of RAM. The vertex locations are structured, with entropy equal to \(7.64\). The leaf nodes (\(67{,}767\) of them) are in the halolike peripheral area. The rest can be roughly put into two hemispherical regions, which may likely correspond to the largest user populations in two geographical areas, one is in and around Brazil, the other is in and around India and some other countries in Asia.
SGtSNE  2D embedding  

Variation with the rescaling parameter \(\lambda\)  
\(\lambda=1.0\)  \(\lambda=10.0\) 
\(\lambda=20.0\)  \(\lambda=80.0\) 
SGtSNE  3D embedding  

Variation with the rescaling parameter \(\lambda\)  
\(\lambda=1.0\)  \(\lambda=10.0\) 
\(\lambda=20.0\)  \(\lambda=80.0\) 
We show 20 of the largest subcommunities on the \(\lambda = 20\) 2D embedding. The communities are clearly separated in the top and bottom hemispheres. Some of the communities are tightly clustered together while others are spread out in the corresponding hemisphere.
SGtSNE  1D embedding  

4color scheme  Logarithmic 
The weighted adjacency matrix of the Orkut graph. The vertices are ordered by the 1D embedding of SGtSNEΠ with \(\lambda = 10\). Each pixel corresponds to a \(1{,}750\times 1{,}750\) matrix block. Each pixel is colored based on the number of nonzero elements inside the corresponding matrix block. Two different color schemes are used. The blocks on the two offdiagonal blocks are extremely sparse, containing zero to twelve nonzero elements. 
LiveJournal
Spaceland (3D) and Flatland (2D) embedding of the
LiveJournal
online community. The network has \(n = 3{,}997{,}962\) user nodes, \(m = 73{,}360{,}340\) friendship
links. The degree varies from \(1\) at the minimal (leaf nodes) to
\(14{,}815\) at the maximal. The
graph is not admissible to the conventional tSNE. The embedding
is by SGtSNE using \(8{,}000\)
iterations, with initial coordinates drawn from a Gaussian random
distribution.
The graph has a hierarchical structure; the leaf nodes comprise the outer halo, while higher levels in the hierarchy reside in the denser areas of the embeddings. The treelike structure is seen better in the 3D embeddings.
The degree (log scale) histogram of the LiveJournal
online community network, with relative frequencies. The degrees
show two peaks; one at 12 degrees and one at 1020 degrees. More
than 18% of the users have one friendship link. There are 485 hub
users with more than 1,000 friendship links. The highest number of
links for a single user is 14,815.

SGtSNE  2D embeddings  

Variation with the rescaling parameter \(\lambda\)  
\(\lambda=1.0\)  \(\lambda=10.0\) 
\(\lambda=20\)  \(\lambda=80\) 
The 3D embeddings depicts better the hierarchical structure of
the LiveJournal
network than the corresponding 2D
embeddings. From outwards to inwards we traverse the network from
leaf nodes (lower levels in the hierarchy) to hub nodes (higher
levels in the hierarchy).
SGtSNE  3D embeddings  

Variation with the rescaling parameter \(\lambda\)  
\(\lambda=1.0\)  \(\lambda=10.0\) 
\(\lambda=20.0\)  \(\lambda=80.0\) 
kNN graphs (two realworld and two synthetic graphs)
We make comparisons with the conventional tSNE using kNN graph. SGtSNE shows superior embedding results over conventional tSNE. The rescaling hyperparameter \(\lambda\) has a larger range of available values comparing to the perplexity, exploring the sparsity of the graph. The perplexity hyperparameter is limited by the number of neighbors at each vertex in the kNN graph.
Two realworld biological networks and two synthetic networks are shown:
 scRNAseq of 1.3 million mouse brain cells
 scRNAseq of 8k Peripheral Blood Mononuclear Cells (PBMCs)
 Lattice graph on Möbius strip
 GEB “braid” book cover
Biological networks
scRNAseq of 1.3 million mouse brain cells
Spaceland(3D) and Flatland(2D) embedding of the kNN graph (\(k = 90\)) associated with \(n =\) 1,306,127 RNA sequences of E18 mouse brain cells. The \(k\)neighbors (\(k = 90\)) are located in the space of the 50 principle components of the top 1,000 variable genes.
Two cell populations are annotated: GABAergic subtype (Sncg, Slc17a8) in blue and VLMC subtype (Spp1, Col15a1) in red. They are better separated from the rest in 3D embedding.
The SGtSNE embedding with \(k = 30\) and \(\lambda = 10\) is visually similar to the tSNE embedding with \(u=30, k = 90\). The graph used in SGtSNE is \(3\times\) sparser than the graph used by tSNE with a corresponding speedup in the execution time of the attractive forces. The 3D embeddings preserve the neighborhood relatively better, both by visual and quantitative inspection.
tSNE  2D embeddings  

Variation with the perplexity parameter \(u\)  
\(u=10, k=30\)  \(u=30, k=90\)  \(u=50, k=150\) 
SGtSNE  2D embeddings  

Variation with the rescaling parameter \(\lambda\)  
\(\lambda = 1\)  
\(\lambda = 10\)  
\(k = 30\)  \(k = 90\)  \(k = 150\) 
tSNE  3D embeddings  

Variation with the perplexity parameter \(u\)  
\(u=10, k=30\)  \(u=30, k=90\)  \(u=50, k=150\) 
SGtSNE  3D embeddings  

Variation with the rescaling parameter \(\lambda\)  
\(\lambda = 1\)  
\(\lambda = 10\)  
\(k = 30\)  \(k = 90\)  \(k = 150\) 
Histograms of the stochastic \(k\)neighbors recall values at all vertices, \(\text{recall}(i) = \sum_j \mathbf{P}_{ji} \cdot b_{ij}\), where \(\mathbf{B}_{k} = [b_{ij}]\) is the adjacency matrix of the \(k\)NN graph of the embedding points \(\{\mathbf{y}_{i} \}\), in Euclidean distance. The stochastic weights \(p_{ji}\) are common to both 2D and 3D embedding (\(u=30, k=90\)). The histogram for the 3D embedding (in red) shows relatively higher recall scores in larger cell population. The quantitative comparison is consistent with that by visual inspection. 
scRNAseq of 8k Peripheral Blood Mononuclear Cells (PBMCs)
Subpopulations of \(n =\) 8,381 peripheral blood mononuclear cells (PBMC8k), shown in complementary views. The subpopulations are found by the SDDP cluster analysis with \(\tau =\) 0.3, from gene expressions of \(d =\) 21,322 genes.
The embeddings by SGtSNE from a much sparse matrix are compelling. The SGtSNE embedding with with \(k = 30\) and \(\lambda = 10\) is visually similar to the tSNE embedding with \(u=50\). The subpopulations identified by the SDDP cluster analysis is consistent with the tSNE and SGtSNE embeeddings. The 3D embeddings show better separation of the clusters.
The \(k\)NN adjacency matrix, \(k = 30\) with rows/columns permuted so that cells in the same subpopulation are placed together. The column clusters are color coded. Each pixel corresponds to a 16 x 16 matrix block. 
tSNE  2D embeddings  

Variation with the perplexity parameter \(u\)  
\(u=10, k=30\)  \(u=30, k=90\)  \(u=50, k=150\) 
SGtSNE  2D embeddings  

Variation with the rescaling parameter \(\lambda\)  
\(\lambda = 0.1\)  
\(\lambda = 1\)  
\(\lambda = 20\)  
\(k = 30\)  \(k = 90\)  \(k = 150\) 
tSNE  3D embeddings  

Variation with the perplexity parameter \(u\)  
\(u=10, k=30\)  \(u=30, k=90\)  \(u=50, k=150\) 
SGtSNE  3D embeddings  

Variation with the rescaling parameter \(\lambda\)  
\(\lambda = 0.1\)  
\(\lambda = 1\)  
\(\lambda = 20\)  
\(k = 30\)  \(k = 90\)  \(k = 150\) 
Synthetic networks
Lattice graph on Möbius strip
Essential neighbor and structural connections in a \(k\)NN graph for a Möbius strip lattice, with \(n = 8{,}192\ ( =256\times 32)\) lattice nodes, are obscured or crumbled by a 2D embedding yet principally captured by a 3D embedding. The embedding of the same graph by the conventional tSNE within the feasible perplexity range suffers more distortions.
The stochastic matrix \(P\) of the \(k\)NN graph, \(k=150\), by Euclidean distance, is displayed in a \(512\!\times\! 512\) pixel array. Each pixel represents a \(16\times 16\) matrix block, its value (color coded) is the total sum of the block elements. 
The embeddings by tSNE and SGtSNE, using \(l=1{,}000\) iterations, including \(l_e = 250\) early exaggeration ones with \(\alpha = 12\). The initial coordinates are drawn from a random uniform distribution with variance equal to 1.
tSNE  2D embeddings  

Variation with the perplexity parameter \(u\)  
\(u=10, k=30\)  \(u=30, k=90\)  \(u=50, k=150\) 
SGtSNE  2D embeddings  

Variation with the rescaling parameter \(\lambda\)  
\(\lambda = 1\)  
\(\lambda = 10\)  
\(\lambda = 80\)  
\(k = 30\)  \(k = 90\)  \(k = 150\) 
tSNE  3D embeddings  

Variation with the perplexity parameter \(u\)  
\(u=10, k=30\)  \(u=30, k=90\)  \(u=50, k=150\) 
SGtSNE  3D embeddings  

Variation with the rescaling parameter \(\lambda\)  
\(\lambda = 1\)  
\(\lambda = 10\)  
\(\lambda = 80\)  
\(k = 30\)  \(k = 90\)  \(k = 150\) 
GEB “braid” book cover
Structural connections of the GEB braid (as seen on the cover of “Gödel, Escher, Bach” book). The 3 letters (G, E, and B) are approximately recovered by SGtSNE with the rescaling parameter \(\lambda\) set to 50 or 150, using \(k = 150\) neighbors.
The stochastic matrix \(P\) of the \(k\)NN graph, \(k=150\), by Euclidean distance, is displayed in a \(512\!\times\! 512\) pixel array. Each pixel represents a \(16\times 16\) matrix block, its value (color coded) is the total sum of the block elements. 
The 3D GEB braid data \(N = 8{,}053\), used to generate the \(k\)NN graph. Extra connections are added on ‘loose’ ends to enforce connectivity. 
The embeddings by tSNE and SGtSNE, using \(l=1{,}000\) iterations, including \(l_e = 500\) early exaggeration ones with \(\alpha = 12\). The initial coordinates are drawn from a random uniform distribution with variance equal to 1.
tSNE  2D embeddings  

Variation with the perplexity parameter \(u\)  
\(u=10, k=30\)  \(u=30, k=90\)  \(u=50, k=150\) 
SGtSNE  2D embeddings  

Variation with the rescaling parameter \(\lambda\)  
\(\lambda = 1\)  
\(\lambda = 10\)  
\(\lambda = 80\)  
\(k = 30\)  \(k = 90\)  \(k = 150\) 
tSNE  3D embeddings  

Variation with the perplexity parameter \(u\)  
\(u=10, k=30\)  \(u=30, k=90\)  \(u=50, k=150\) 
SGtSNE  3D embeddings  

Variation with the rescaling parameter \(\lambda\)  
\(\lambda = 1\)  
\(\lambda = 10\)  
\(\lambda = 80\)  
\(k = 30\)  \(k = 90\)  \(k = 150\) 
Performance
Execution time
Multiple comparisons in execution time for embedding of kNN graphs (\(k =\) 30,90) with 1,306,127 nodes as singlecell RNA sequences of E18 mouse brain cells. Each embedding uses 1,000 iterations and maintains the same approximation level \(10^{6}\). (a) In total time, with the same sparsity parameter \(k\), 3D embeddings by tSNEΠ take less time to finish \(1.5\times\) operations in comparison to 2D embeddings by tSNE, on each of the computers used. (b) In memory usage, tSNEΠ does not augment grid size and thereby uses less memory space. (c) The computation with sparse matrix PQ dominates the tSNE time on computers with lower bandwidth. The embedding by SGtSNE on a sparser graph (\(k = 30\)) reduces the PQ time further, with results comparable to that by tSNE on the graph \(3\times\) as dense. The combined speedup of SGtSNEΠ is 5.3x on the Core i76700 workstation and 4.5x on the Core i74558U MacBook Pro (Late 2013). The code of tSNEΠ is to be optimized; a new version that utilizes the graphics card, if any, is in development. 
Data permutation for repulsive interactions
Sequential execution  Parallel execution (4 threads) 
Comparison in sequential and parallel execution times of

Architecture specification
Architecture specifications of the three computers used in the experiments. The first computer is a MacBook Pro (Late 2013), the second one is a desktop and the third computer is a server. Memory bandwidth is measured with the STREAM benchmark copy function running in parallel. The execution time results match the expected performance due to the bandwidth and clock ratios between each computer. 
Supplementary: relating the adjacency matrix to the graph embedding matrix in SGtSNE^{1}
For graph embedding with SGtSNE
, a graph \(G\) is represented and provided by
its adjacency matrix \(\mathbf{A}\) at input, not limited to
a pairwise distance or similarity matrix as in SNE
or
tSNE
. Matrix \(\mathbf{A}\) may be very sparse for a
large graph, while a distance/similarity matrix is full, with
alltoall connections. In SGtSNE
, similar to
SNE
, we convert the input matrix into a stochastic
guidance matrix \(\mathbf{P}_c\)
for graph embedding, with a distinction. We allow \(\mathbf{P}_c\) to preserve the
sparsity in network connectivity, i.e., \(\mathbf{P}_c\) has the same
zeroandnonzero pattern as \(\mathbf{A}\). We maintain that graph
embedding matrix \(\mathbf{P}_c\)
is columnstochastic, and the nonzero elements in each column are
not constant.
We describe the relationship between \(\mathbf{P}_c\) and \(\mathbf{A}\) in two typical application cases. We may assume that \(\mathbf{A}\) has no zero columns.
Matrix \(\mathbf{A}\) is nonnegatively and unequally weighted, a higher weight on an edge \((u,v)\) indicating a closer affinity between node \(u\) and node \(v\). In this case, \(\mathbf{P}_c\) can be simply a columnstochastic version of \(\mathbf{A}\), namely, the columns are scaled so that each and every column sums to \(1\).
Matrix \(\mathbf{A}\) is unweighted, binary valued with \(0\)s and \(1\)s. In this case, we introduce stochastic weights in \(\mathbf{P}_c\) to capture and reflect the similarity or discrepancy in local topological connectivity in graph \(G\). There are more than one ways to capture such variation. We describe a simple method deployed in the current
SGtSNE
implementation.^{2} For each pair of linked nodes \(u\) and \(v\), we assign the following Jaccard affinity score on the edge \((u,v)\), \[\mathbf{P}_c(u,v) = c(v) \cdot \mathbf{A}(u,v) \cdot \frac{  \mathcal{N}(u) \cap \mathcal{N}(v)  } {  \mathcal{N}(u) \cup \mathcal{N}(v)  },\] where \(\mathcal{N}(v)\) denotes the onehop neighborhood of node \(v\), \(S\) denotes the cardinality of a discrete set \(S\), and \(c(v)\) normalizes the sum of column\(v\) to \(1\). The affinity score of \(u\) to \(v\) is higher if \(u\) shares more common neighbors with \(v\) among the nodes adjacent to either \(u\) or \(v\). The neighborhood definition can be changed, such as from one hop to two hops.
When the variation in the Jaccard scores is small, it indicates a high degree of symmetry in graph \(G\). For instance, the scores are constant over any clique, any 2D torus, or the graph of fullerene \(C_{60}\). Such graphs may be embedded well with the geometric embedding method via the graph Laplacian.
There is a meaningful way to convert Case (b) to Case (a) at
the userinterface level, without invoking the internal
differentiation in local topology. For example, \(\mathbf{A}+\mathbf{A}^2\) represents
the node connectivity within twohop walks on \(G\). Or one may use a graph augmented
by \(\sum_{k=1}^{h} \beta_k
\mathbf{A}^k\) within \(h\)hops, \(h>1.\) The hopweights \(\beta_k\) may decrease with the
increase of \(k\). It is
beneficial in a discriminative or exploratory graph analysis to
use SGtSNE
not only on graph \(G\) but also on a few other graphs
induced from or related to \(G\).
In an extreme case, one can get from \(G\) the geodesic distance matrix
without or with thresholding on the pairwise distances.
In any case, we also associate \(\mathbf{P}_c\) with a neighborhood scaling parameter denoted as \(\lambda\). The user can alter the default setting. The parameter adjusts the effective neighborhood range by nonlinearly rescaling the weights to concentrate more on closer neighbors or spread out to reach farther neighbors.
Precursor works
L. van der Maaten and G. Hinton, “Visualizing data using tSNE,” Journal of Machine Learning Research, vol. 9, no. Nov, pp. 2579–2605, 2008.
L. van der Maaten, “Accelerating tSNE using treebased algorithms,” Journal of Machine Learning Research, vol. 15, no. Oct, pp. 3221–3245, 2014. [Code]
G. C. Linderman, M. Rachh, J. G. Hoskins, S. Steinerberger, and Y. Kluger, “Fast interpolationbased tSNE for improved visualization of singlecell RNAseq data,” Nature Methods, vol. 16, no. 3, pp. 243– 245, 2019.
Dimitris Floros, Tiancheng Liu, Nikos Pitsianis and Xiaobai Sun, “Sparse Dual of the Density Peaks Algorithm for Cluster Analysis of HighDimensional Data,” Proc. 2018 IEEE High Performance Extreme Computing Conference (HPEC), 2018.
Data sources
J. Yang and J. Leskovec, “Defining and evaluating network communities based on groundtruth,” Knowledge and Information Systems, vol. 42, no. 1, pp. 181–213, 2015.
“Transcriptional profiling of 1.3 million brain cells with the Chromium Single Cell 3’ solution,” 10x Genomics LIT000015 Chromium Million Brain Cells Application Note, 2017.
G. X. Y. Zheng, J. M. Terry, P. Belgrader, P. Ryvkin, Z. W. Bent, R. Wilson, S. B. Ziraldo, T. D. Wheeler, G. P. McDermott, J. Zhu, M. T. Gregory, J. Shuga, L. Montesclaros, J. G. Underwood, D. A. Masquelier, S. Y. Nishimura, M. SchnallLevin, P. W. Wyatt, C. M. Hindson, R. Bharadwaj, A. Wong, K. D. Ness, L. W. Beppu, H. J. Deeg, C. McFarland, K. R. Loeb, W. J. Valente, N. G. Ericson, E. A. Stevens, J. P. Radich, T. S. Mikkelsen, B. J. Hindson, and J. H. Bielas, “Massively parallel digital transcriptional profiling of single cells,” Nature Communications, vol. 8, p. 14049, 2017.
Acknowledgments
The experiments were carried out with the assistance of Pantazis Vouzaksakis and Xenofon Theodoridis.