Most previous super-resolution (SR) approaches are implemented as two separate, cascaded steps: image registration and image fusion. They face a dilemma: fusion demands accurate motion estimation, yet without the high-quality image that only the fusion step produces, the motion fields are estimated inaccurately. In this paper, we pose SR reconstruction as a Bayesian state estimation problem, so that image alignment and image fusion are combined into one unified framework. We build a part-based face model that encodes the structural information of human faces. The prior information from the face model is incorporated into both the registration and the fusion processes of super-resolution, and high-resolution images are reconstructed using a Sequential Monte Carlo based algorithm. Experiments on synthesized frontal face sequences show that the proposed approach achieves superior performance in both registration and reconstruction.