Mean field theory
From Academic Kids

A many-body system with interactions is generally very difficult to solve exactly, except in extremely simple cases (Gaussian field theory, the 1D Ising model). The main difficulty (e.g. when computing the partition function of the system) is the treatment of the combinatorics generated by the interaction terms in the Hamiltonian when summing over all states. The goal of mean field theory (MFT, also known as self-consistent field theory) is to resolve these combinatorial problems.
The main idea of MFT is to replace all interactions acting on any one body with an average or effective interaction. This reduces any multi-body problem to an effective one-body problem. Because MFT problems are easy to solve, some insight into the behavior of the system can be obtained at relatively low cost.
In field theory, the Hamiltonian may be expanded in terms of the magnitude of fluctuations around the mean of the field. In this context, MFT can be viewed as the "zeroth-order" expansion of the Hamiltonian in fluctuations. Physically, this means an MFT system has no fluctuations, which coincides with the idea that one is replacing all interactions with a "mean field". Quite often, in the formalism of fluctuations, MFT provides a convenient starting point for studying first- or second-order fluctuations.
In general, dimensionality plays a strong role in determining whether a mean-field approach will work for a particular problem. In MFT, many interactions are replaced by one effective interaction. It follows that if each field or particle has many interactions in the original system, MFT will be more accurate for that system. This is the case in high dimensions, or when the Hamiltonian includes long-range forces.
While MFT arose primarily in statistical mechanics, it has more recently been applied elsewhere, for example to inference in graphical models in artificial intelligence.
Formal approach
The formal basis for mean field theory is the Bogoliubov inequality. This inequality states that the free energy of a system with Hamiltonian
 <math>\mathcal{H}=\mathcal{H}_{0}+\Delta \mathcal{H}</math>
has the following upper bound:
 <math>F \leq F_{0} \equiv \langle \mathcal{H} \rangle_{0} - T S_{0}</math>
where the average is taken over the equilibrium ensemble of the reference system with Hamiltonian <math>\mathcal{H}_{0}</math>, and <math>S_{0}</math> is the entropy of that ensemble. In the special case that the reference Hamiltonian is that of a non-interacting system and can thus be written as
 <math>\mathcal{H}_{0}=\sum_{i=1}^{N}h_{i}\left( \xi_{i}\right)</math>
where <math>\xi_{i}</math> is shorthand for the degrees of freedom of the individual components of our statistical system (atoms, spins and so forth), one can consider sharpening the upper bound by minimizing the right-hand side of the inequality. The minimizing reference system is then the "best" approximation to the true system using non-correlated degrees of freedom, and is known as the mean field approximation.
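The inequality can be checked numerically on a system small enough to enumerate. The following is a minimal sketch, not part of the original derivation: it assumes a ferromagnetic Ising ring with coupling J, units where k = 1, and a trial reference Hamiltonian of independent spins in a uniform field b. The function names and the parameter `field` are hypothetical, chosen only for illustration.

```python
import itertools
import math


def exact_free_energy(n_spins, coupling, temperature):
    """Brute-force F = -T log Z for an Ising ring with H = -J sum s_i s_{i+1} (k = 1)."""
    z = 0.0
    for spins in itertools.product((-1, 1), repeat=n_spins):
        energy = -coupling * sum(
            spins[i] * spins[(i + 1) % n_spins] for i in range(n_spins)
        )
        z += math.exp(-energy / temperature)
    return -temperature * math.log(z)


def variational_free_energy(n_spins, coupling, temperature, field):
    """F_0 = <H>_0 - T S_0 for the trial system H_0 = -b sum s_i (assumed form)."""
    m = math.tanh(field / temperature)  # <s_i>_0 in the reference ensemble
    p_up, p_down = 0.5 * (1 + m), 0.5 * (1 - m)
    # Entropy of one independent spin; skip zero-probability states.
    entropy_per_spin = -sum(p * math.log(p) for p in (p_up, p_down) if p > 0)
    # A ring of n >= 3 spins has n bonds, and <s_i s_j>_0 = m^2 for independent spins.
    mean_energy = -coupling * n_spins * m * m
    return mean_energy - temperature * n_spins * entropy_per_spin


if __name__ == "__main__":
    f_exact = exact_free_energy(8, 1.0, 1.5)
    for b in (0.0, 0.5, 1.0, 2.0, 4.0):
        f_trial = variational_free_energy(8, 1.0, 1.5, b)
        print(f"b={b:.1f}  F_0={f_trial:.4f}  F={f_exact:.4f}")
```

Every trial field gives a value of F_0 at or above the exact F; scanning b for the smallest F_0 is the minimization described above, restricted to this one-parameter family.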
For the most common case that the target Hamiltonian contains only pairwise interactions, i.e.,
 <math>\mathcal{H}=\sum_{(i,j)\in \mathcal{P}}V_{i,j}\left( \xi_{i},\xi_{j}\right)</math>
where <math>\mathcal{P}</math> is the set of pairs that interact, the minimizing procedure can be carried out formally. Define <math>{\rm Tr}_{i}f(\xi_{i})</math> as the generalized sum of the observable <math>f</math> over the degrees of freedom of the single component (a sum for discrete variables, an integral for continuous ones). The approximating free energy is given by
 <math>F_{0} = {\rm Tr}_{1,2,\ldots,N}\,\mathcal{H}(\xi_{1},\xi_{2},\ldots,\xi_{N})P^{(N)}_{0}(\xi_{1},\xi_{2},\ldots,\xi_{N}) + kT \,{\rm Tr}_{1,2,\ldots,N}\,P^{(N)}_{0}(\xi_{1},\xi_{2},\ldots,\xi_{N})\log P^{(N)}_{0}(\xi_{1},\xi_{2},\ldots,\xi_{N})</math>
where <math>P^{(N)}_{0}(\xi_{1},\xi_{2},\ldots,\xi_{N})</math> is the probability of finding the reference system in the state specified by the variables <math>(\xi_{1},\xi_{2},\ldots,\xi_{N})</math>. This probability is given by the normalized Boltzmann weight (with <math>\beta = 1/kT</math>)
 <math>P^{(N)}_{0}(\xi_{1},\xi_{2},\ldots,\xi_{N})=\frac{1}{Z^{(N)}_{0}}e^{-\beta \mathcal{H}_{0}(\xi_{1},\xi_{2},\ldots,\xi_{N})}=\prod_{i=1}^{N}\frac{1}{Z_{0}}e^{-\beta h_{i}\left( \xi_{i}\right)} \equiv \prod_{i=1}^{N} P^{(i)}_{0}(\xi_{i})</math>.
Thus
 <math>F_{0}=\sum_{(i,j)\in\mathcal{P}} {\rm Tr}_{i,j}V_{i,j}\left( \xi_{i},\xi_{j}\right)P^{(i)}_{0}(\xi_{i})P^{(j)}_{0}(\xi_{j})+ kT \sum_{i=1}^{N} {\rm Tr}_{i} P^{(i)}_{0}(\xi_{i}) \log P^{(i)}_{0}(\xi_{i}).</math>
In order to minimize, we take the derivative with respect to the single degree-of-freedom probabilities <math>P^{(i)}_{0}</math>, using a Lagrange multiplier to ensure proper normalization. The end result is the set of self-consistency equations
 <math>P^{(i)}_{0}(\xi_{i})=\frac{1}{Z_{0}}e^{-\beta h_{i}^{MF}(\xi_{i})}\qquad i=1,2,\ldots,N</math>
where the mean field is given by
 <math>h_{i}^{MF}(\xi_{i})=\sum_{\{j \,\mid\, (i,j)\in\mathcal{P}\}}{\rm Tr}_{j}V_{i,j}\left( \xi_{i},\xi_{j}\right)P^{(j)}_{0}(\xi_{j})</math>
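Because each single-site probability depends on the probabilities of its neighbors, the self-consistency equations are usually solved by iteration. The sketch below is one possible way to do this for discrete variables, under several assumptions not made in the text: the same symmetric pair potential on every interacting pair, a sweep that updates sites one at a time, and a fixed number of sweeps instead of a convergence test. The names `mean_field_fixed_point`, `neighbors`, and `init` are hypothetical.

```python
import math


def mean_field_fixed_point(neighbors, states, potential, beta, init, sweeps=500):
    """Iterate P_i(x) = exp(-beta * h_i^MF(x)) / Z_0 site by site.

    neighbors[i] lists the sites j with (i, j) in the interacting set;
    potential(x_i, x_j) is the pair interaction V (assumed identical for all pairs);
    init is a list of dicts {state: probability}, one per site.
    """
    probs = [dict(p) for p in init]
    for _ in range(sweeps):
        for i, nbrs in enumerate(neighbors):
            # Mean field on site i: V averaged over each neighbor's current P_j.
            h_mf = {
                x: sum(potential(x, y) * probs[j][y] for j in nbrs for y in states)
                for x in states
            }
            weights = {x: math.exp(-beta * h_mf[x]) for x in states}
            z0 = sum(weights.values())  # single-site normalization
            probs[i] = {x: w / z0 for x, w in weights.items()}
    return probs


if __name__ == "__main__":
    # Illustrative case: four Ising spins on a ring with V(s, s') = -J s s', J = 1.
    ring = [[1, 3], [0, 2], [1, 3], [0, 2]]
    start = [{-1: 0.4, 1: 0.6} for _ in ring]  # small bias to break the up/down symmetry
    result = mean_field_fixed_point(ring, (-1, 1), lambda a, b: -a * b, 1.0, start)
    print([sum(x * p for x, p in site.items()) for site in result])
```

A perfectly uniform starting distribution is itself a fixed point for a symmetric model, which is why the usage example starts from a slightly biased one.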
Example
Consider the Ising model on a d-dimensional cubic lattice. The Hamiltonian is given by
 <math> H = -J \Sigma^{'} \mathbf{s_i} \mathbf{s_{i'}} </math>
where the <math> \Sigma^{'} </math> indicates summation over nearest neighbors, and <math>\mathbf{s_i} </math> and <math>\mathbf{s_{i'}}</math> are neighboring Ising spins.
Let's transform our spin variable by introducing the fluctuation from its mean value <math> \langle\mathbf{s}\rangle </math>. We may rewrite the Hamiltonian:
 <math> H = -J \Sigma^{'} (\mathbf{\Delta(s_i) + \langle s_i \rangle}) (\mathbf{\Delta(s_{i'})+ \langle s_{i'}\rangle}) </math>
where we define <math> \mathbf{\Delta(s) \equiv s - \langle s\rangle} </math>; this is the fluctuation term of the spin. If we multiply out the right-hand side, we obtain one term that depends only on the mean values of the spins and is independent of the spin configuration. This is the trivial term: it contributes only a constant factor to the partition function and does not affect the statistics of the spins. The next term involves the product of the mean value of the spin and the fluctuation. Finally, the last term involves a product of two fluctuations.
If fluctuations are small, we may neglect this last term. As per the above arguments, MFT should intuitively work better the smaller the fluctuations are. Dropping the constant term as well, we are left with
 <math> H \approx -J \Sigma^{'} (\mathbf{ 2 \Delta(s_i) \langle s_{i'}\rangle }) </math>
Again, the summand can be re-expanded to
 <math> \mathbf{2 (s_i - \langle s_i\rangle) \langle s_{i'}\rangle} </math>
The only term that matters from the partition function's point of view is the first product,
 <math> \mathbf{2 s_i \langle s_{i'}\rangle} </math>
By symmetry arguments, the mean value of each spin is site-independent. We can replace <math> \langle\mathbf{s_{i'}}\rangle </math> with <math>\langle \mathbf{s}\rangle </math>.
We are still stuck with a double summation over neighboring spins, yet the summand involves only one site of each neighboring pair. Roughly speaking, we count 2d bonds (where d is the dimensionality of the cubic lattice) for each site. But since each bond is shared by two spins, we would be overcounting by a factor of 2 if we gave each site a multiplicity of 2d; each site therefore enters with multiplicity d, which combines with the factor of 2 in the summand. The Hamiltonian becomes
 <math> H = -2dJ\langle\mathbf{s}\rangle \Sigma_i \mathbf{s_i} </math>
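The bond-counting argument can be checked directly on a small lattice. The sketch below assumes a periodic cubic lattice with at least three sites per side (so that no bond is listed twice) and verifies that summing (s_i + s_i') over all bonds equals 2d times the sum of s_i over all sites. The helper names are hypothetical.

```python
import itertools
import random


def periodic_bonds(side, dim):
    """Nearest-neighbor bonds of a periodic cubic lattice, each listed once."""
    bonds = []
    for site in itertools.product(range(side), repeat=dim):
        for axis in range(dim):
            neighbor = list(site)
            neighbor[axis] = (neighbor[axis] + 1) % side  # step forward along one axis
            bonds.append((site, tuple(neighbor)))
    return bonds


def bond_sum_vs_site_sum(side, dim, seed=0):
    """Return (sum over bonds of s_i + s_i', 2d * sum over sites of s_i)."""
    rng = random.Random(seed)
    spins = {
        site: rng.choice((-1, 1))
        for site in itertools.product(range(side), repeat=dim)
    }
    bond_sum = sum(spins[a] + spins[b] for a, b in periodic_bonds(side, dim))
    site_sum = 2 * dim * sum(spins.values())
    return bond_sum, site_sum


if __name__ == "__main__":
    print(len(periodic_bonds(3, 2)))   # number of bonds: d per site
    print(bond_sum_vs_site_sum(3, 3))  # the two sums agree for any spin configuration
```

Each site belongs to 2d bonds but the lattice has only d bonds per site, which is the factor of 2 discussed above.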
At this point, the Hamiltonian has been reduced to that of a single-body problem. The drawback is that the effective coupling constant now contains the mean value of the spin, which is not known in advance and must be determined self-consistently.
Substituting this Hamiltonian into the partition function, and solving the effective one-body problem, we obtain
 <math> Z = (2 \cosh(\frac{2dJ}{T} \langle \mathbf{s}\rangle))^{N} </math>
where <math> N </math> is the number of lattice sites and the temperature is measured in units where k = 1. This is a closed-form expression for the partition function of the mean-field Hamiltonian. From it we may obtain the free energy of the system and calculate critical exponents.
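Averaging a single spin with the effective one-body Hamiltonian gives the self-consistency condition ⟨s⟩ = tanh(2dJ⟨s⟩/T), which has a nonzero solution only below the mean-field critical temperature T = 2dJ (with k = 1). The sketch below solves this condition by simple fixed-point iteration. The function name, starting value, and tolerance are assumptions for illustration, and the iteration converges slowly close to the critical temperature.

```python
import math


def mean_field_magnetization(dim, coupling, temperature,
                             start=0.5, tol=1e-12, max_iter=100000):
    """Solve m = tanh(2 d J m / T) by fixed-point iteration (k = 1)."""
    m = start
    for _ in range(max_iter):
        m_next = math.tanh(2 * dim * coupling * m / temperature)
        if abs(m_next - m) < tol:
            return m_next
        m = m_next
    return m  # not converged within max_iter; returned as a best effort


if __name__ == "__main__":
    # d = 2, J = 1: the mean-field critical temperature is 2dJ = 4.
    for t in (2.0, 3.0, 6.0):
        print(t, mean_field_magnetization(2, 1.0, t))
```

Below the critical temperature the iteration settles on a nonzero magnetization; above it, the only solution is ⟨s⟩ = 0.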
MFT is known under a great many names and guises. Similar techniques include the Bragg-Williams approximation, the Bethe approximation, and Landau theory.