# dang-sunburst

The sunburst chart on this page displays a distribution of data along two independent dimensions, X and Y.

To generate this data, 1500 random samples were drawn from the joint PDF of two random variables: one normally distributed (mean 3, standard deviation 1) and one gamma distributed (shape parameter 3).

I used Python to do this:
```python
import numpy as np
from numpy.random import randn
import pandas as pd
from scipy import stats
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sns

# Generate discrete joint distribution

# Set the figure size before the figure is created so it takes effect
mpl.rc("figure", figsize=(6, 6))
fig = plt.figure()

x = np.random.normal(3, 1, 1500)  # normally distributed samples
y = stats.gamma(3).rvs(1500)      # gamma-distributed samples

# Bin the samples into a 10x10 grid of counts.
# The first argument (y) is binned along the first axis of H.
H, xedges, yedges = np.histogram2d(y, x, [10, 10])
X, Y = np.meshgrid(xedges, yedges)

# pcolormesh expects the rows of the color array to follow the
# vertical axis, so H is transposed before plotting
plt.pcolormesh(X, Y, H.T)
ax = plt.gca()
ax.set_xlim([0, 10])
ax.set_ylim([0, 6])
ax.set_xlabel('X')
ax.set_ylabel('Y')
```
This results in the joint distribution shown in the plot below.

Next, we map the bins of each dimension, X and Y, to a set of variables. In this case, we generated 10 bins for X and 10 bins for Y, which is easily done by calling the NumPy `histogram2d` function. This results in a binned, 10x10 grid.

This data is displayed in the sunburst chart, with the X dimension represented in the inner ring and the Y dimension represented in the outer ring (applied to each arc).
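As a rough check on the binning step, the sketch below reruns `np.histogram2d` on freshly drawn samples and inspects what comes back. The seeded `rng` generator is an assumption added here for reproducibility; it is not part of the original script.

```python
import numpy as np

rng = np.random.default_rng(0)  # assumed seed, only for reproducibility

x = rng.normal(3, 1, 1500)           # normally distributed samples
y = rng.gamma(shape=3.0, size=1500)  # gamma-distributed samples (shape 3, scale 1)

# y is binned along the first axis of H, x along the second
H, xedges, yedges = np.histogram2d(y, x, [10, 10])

print(H.shape)       # (10, 10): one count per pair of bins
print(len(xedges))   # 11 edges delimit 10 bins
print(int(H.sum()))  # 1500: every sample lands in exactly one bin
```

Because the default bin range spans the minimum to the maximum of the data, no sample falls outside the grid, so the counts add up to the sample size.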

Because the sunburst chart groups things categorically, we are converting the quantitative X and Y scales to groups according to bins. We arbitrarily label the bins, but maintain their order (which is important).
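One way to label the bins while keeping their order is sketched below. The label format (`x0`, `x1`, ...) and the helper name `bin_labels` are hypothetical, chosen only for illustration. Labels are zero-padded when needed so that they still sort in bin order if compared as strings.

```python
def bin_labels(prefix, n_bins):
    """Return ordered labels such as 'x0', 'x1', ... for n_bins bins."""
    width = len(str(n_bins - 1))  # pad so lexical order matches numeric order
    return [f"{prefix}{i:0{width}d}" for i in range(n_bins)]

x_labels = bin_labels("x", 10)
y_labels = bin_labels("y", 10)
print(x_labels[:3])  # ['x0', 'x1', 'x2']
```

With more than 10 bins the same helper would produce `x00`, `x01`, and so on, which keeps the ordering intact.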

The final data is an array of entries, each of which looks like this:
```
{
'x' :
'y' :
'value' :
}
```
The value is provided by the matrix `H` of counts per bin, returned by `np.histogram2d`.
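A minimal sketch of how `H` might be flattened into such an array, assuming one entry per pair of bins. The function name `counts_to_records`, the labels, and the choice of which axis of `H` maps to `'x'` are all assumptions made for illustration; a small stand-in matrix replaces the real 10x10 counts so the example runs on its own.

```python
import numpy as np

def counts_to_records(H, x_labels, y_labels):
    """Flatten a 2D count matrix into a list of {'x', 'y', 'value'} dicts.

    Assumes H[i, j] is the count for x_labels[i] and y_labels[j].
    """
    records = []
    for i, x_label in enumerate(x_labels):
        for j, y_label in enumerate(y_labels):
            records.append({"x": x_label, "y": y_label, "value": int(H[i, j])})
    return records

# Small stand-in for the 10x10 matrix returned by np.histogram2d
H_demo = np.array([[5.0, 0.0], [2.0, 7.0]])
records = counts_to_records(H_demo, ["x0", "x1"], ["y0", "y1"])
print(records[0])  # {'x': 'x0', 'y': 'y0', 'value': 5}
```

Iterating in label order keeps the bins in sequence, which matters because the chart treats them as ordered categories.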