Derivation based on Pastor-Satorras, R., and Wagensberg, J., Physica A (Statistical and Theoretical Physics), Volume 251, No. 3 and 4, March 14, 1998, p. 291-302.
Shannon Entropy:
H(P) = −Σ_k p(k) ln p(k)
Probabilities, by definition, sum to unity:
Σ_k p(k) = 1
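As a quick numerical illustration of the entropy and the normalization condition, here is a minimal Python sketch. The function name shannon_entropy and the two example distributions are assumptions made for illustration; they do not come from the paper.

```python
import math

def shannon_entropy(p):
    """Return H(P) = -sum_k p(k) ln p(k) in nats.

    Terms with p(k) = 0 are skipped, following the usual convention
    that 0 * ln 0 = 0.
    """
    # Probabilities must sum to unity.
    if abs(sum(p) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return -sum(pk * math.log(pk) for pk in p if pk > 0)

# Hypothetical distributions over four outcomes.
uniform = [0.25, 0.25, 0.25, 0.25]
skewed = [0.7, 0.1, 0.1, 0.1]

h_uniform = shannon_entropy(uniform)  # equals ln 4, the maximum for four outcomes
h_skewed = shannon_entropy(skewed)    # strictly smaller than ln 4
```

With no constraint other than normalization, the uniform distribution has the largest entropy. The derivation adds a second constraint, and that is what changes the shape of the maximizing distribution.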
An element of order k has magnitude ℓ(k). It comprises N(k) = ℓ(k)/ε indistinguishable atoms of size ε, arranged in a certain way. The information needed to specify the arrangement of the N(k) atoms in the element is equivalent to that of selecting among N(k) equally probable objects. From information theory [Shannon, 1948], this information is ln N(k). The generating information for specifying an element of order k is therefore I(k) = ln N(k) = ln [ℓ(k)/ε]. The average information over the entire iteration process P is:
⟨I⟩ = Σ_k p(k) I(k) = Σ_k p(k) ln [ℓ(k)/ε]
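The average information can be evaluated directly for a small example. The sketch below is illustrative only: the magnitudes, the atom size eps, and the probabilities are hypothetical values, and average_information is a name chosen here.

```python
import math

def average_information(p, sizes, eps):
    """Return sum_k p(k) ln[l(k)/eps], the average generating information."""
    return sum(pk * math.log(size / eps) for pk, size in zip(p, sizes))

# Hypothetical element magnitudes l(k), atom size eps, and probabilities p(k).
sizes = [2.0, 4.0, 8.0]
eps = 1.0
p = [0.5, 0.25, 0.25]

avg_info = average_information(p, sizes, eps)
```

For these values the sum is 0.5 ln 2 + 0.25 ln 4 + 0.25 ln 8 = 1.75 ln 2.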
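Maximizing H(P) subject to a fixed average information and to normalization, with Lagrange multipliers β and β′, means finding the stationary point of:
F = −Σ_k p(k) ln p(k) + β {−Σ_k p(k) ln [ℓ(k)/ε]} + β′ [1 − Σ_k p(k)]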
∂F/∂p(k) = −ln p(k) − 1 − β ln [ℓ(k)/ε] − β′ = 0
p(k) = exp(−1 − β′) [ℓ(k)/ε]^(−β)
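The result can be checked numerically. The Python sketch below builds a power-law distribution for three hypothetical magnitudes, then perturbs it along a direction that leaves both the normalization and the average information unchanged. If the power law is the constrained maximum, the entropy should drop in either direction. The values of sizes, eps, and beta and the step 0.01 are assumptions chosen for illustration, and the prefactor exp(−1 − β′) is fixed here by normalization.

```python
import math

def power_law_distribution(sizes, eps, beta):
    """p(k) proportional to [l(k)/eps]^(-beta), normalized to sum to 1."""
    weights = [(size / eps) ** (-beta) for size in sizes]
    total = sum(weights)
    return [w / total for w in weights]

def entropy(p):
    return -sum(pk * math.log(pk) for pk in p)

def mean_information(p, sizes, eps):
    return sum(pk * math.log(size / eps) for pk, size in zip(p, sizes))

# Hypothetical values, not taken from the paper.
sizes = [3.0, 9.0, 27.0]
eps = 1.0
beta = 1.5

p = power_law_distribution(sizes, eps, beta)
x = [math.log(size / eps) for size in sizes]

# Stationarity: ln p(k) + beta * ln[l(k)/eps] should be the same for every k
# (it equals -1 - beta' in the notation of the derivation).
stationarity = [math.log(pk) + beta * xk for pk, xk in zip(p, x)]

# Direction orthogonal to (1, 1, 1) and to x, so that moving along it
# preserves both sum_k p(k) and sum_k p(k) ln[l(k)/eps].
delta = [x[2] - x[1], x[0] - x[2], x[1] - x[0]]

perturbed = [[pk + t * dk for pk, dk in zip(p, delta)] for t in (0.01, -0.01)]
entropy_drops = [entropy(p) - entropy(q) for q in perturbed]  # both positive
```

This is a spot check rather than a proof: it tests one direction at one step size. The general statement follows from the concavity of the entropy.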
Since the occupation number n(k) of elements of order k is proportional to the probability, p(k) ~ n(k),
n(k) = const. × [ℓ(k)/ε]^(−β)
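A power law of this form appears as a straight line of slope −β on a log-log plot, which suggests one simple way to estimate β from occupation numbers. The Python sketch below generates exact synthetic counts and recovers the exponent by least squares on the logarithms. The function name fit_power_law_exponent, the prefactor 5000, and the exponent 1.26 are hypothetical. Real counts would be noisy, and a log-log least-squares fit is only a rough estimator in that case.

```python
import math

def fit_power_law_exponent(sizes, counts, eps=1.0):
    """Estimate beta in n(k) = const * [l(k)/eps]^(-beta).

    Fits a straight line to ln n(k) versus ln[l(k)/eps] by ordinary
    least squares and returns minus its slope.
    """
    xs = [math.log(size / eps) for size in sizes]
    ys = [math.log(count) for count in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return -covariance / variance

# Synthetic occupation numbers generated from an assumed exponent.
true_beta = 1.26
sizes = [1.0, 3.0, 9.0, 27.0, 81.0]
counts = [5000.0 * size ** (-true_beta) for size in sizes]

beta_hat = fit_power_law_exponent(sizes, counts)  # recovers true_beta up to rounding
```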
“[T]he occupation numbers scale as a power of ℓ(k), which implies a self-similar behaviour” (Pastor-Satorras and Wagensberg, 1998).
Thus, fractals are evidence for the maximization of entropy (T-Rex's conclusion).