Shaodi YOU - Publications

Multi-view Rectification of Folded Documents

Appeared in IEEE TPAMI

Shaodi You, Yasuyuki Matsushita, Sudipta Sinha, Yusuke Bou and Katsushi Ikeuchi

      Journal paper (1.5MB)
      Extra results (4.2MB)
      Data: Input and Groundtruth (30MB)
      Code for evaluation (Coming soon)
      BibTex

Origami

Abstract: Digitally unwrapping paper sheets is a crucial step for document scanning and accurate text recognition. This paper presents a method for automatically rectifying curved or folded paper sheets from a small number of images captured from different viewpoints. Unlike previous techniques that require either an expensive 3D scanner or an over-simplified parametric representation of the deformations, our method uses only a few images and is based on a general developable surface model that can represent a diverse set of paper sheet deformations. By exploiting the geometric properties of developable surfaces, we develop a robust rectification method based on ridge-aware 3D reconstruction of the paper sheet and L1 conformal mapping. We evaluate the proposed technique quantitatively and qualitatively using a wide variety of input documents, such as receipts, book pages and letters.

Ruler

Developable surfaces with underlying rulers (lines with zero Gaussian curvature) and fold lines (ridges), shown as dotted and solid lines respectively. Examples of (a) smooth parallel rulers, (b) smooth rulers not parallel to each other and (c) rulers and ridges in arbitrary directions.
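As a rough numerical illustration of the zero-Gaussian-curvature property mentioned in the caption, the sketch below estimates the Gaussian curvature of a height field z = f(x, y) with central finite differences. The function names (`gaussian_curvature`, `curled_sheet`, `saddle`) and the two test surfaces are assumptions made for this example only; they are not taken from the paper or its code.

```python
import math

def gaussian_curvature(f, x, y, h=1e-3):
    """Approximate the Gaussian curvature of the graph z = f(x, y) at (x, y).

    Uses K = (f_xx * f_yy - f_xy^2) / (1 + f_x^2 + f_y^2)^2 with
    central finite differences of step h.
    """
    fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h**2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h**2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h**2)
    return (fxx * fyy - fxy**2) / (1 + fx**2 + fy**2) ** 2

def curled_sheet(x, y):
    # Hypothetical developable surface: the sheet bends along one
    # direction only, so all rulers are parallel straight lines.
    return math.sin(x + 0.5 * y)

def saddle(x, y):
    # Hypothetical non-developable surface, for contrast.
    return x * y

k_sheet = gaussian_curvature(curled_sheet, 0.3, -0.2)
k_saddle = gaussian_curvature(saddle, 0.0, 0.0)
print(f"curled sheet: K = {k_sheet:.2e}")
print(f"saddle:       K = {k_saddle:.2e}")
```

The curled sheet gives a curvature that is zero up to discretization error, while the saddle does not, which is why a saddle-shaped surface cannot be flattened without stretching.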

Reconstruction

Ridge-aware 3D surface reconstruction: we extend the Poisson surface reconstruction method by incorporating ridge constraints and by adding robustness to outliers.

RVisual

Rectification results from combinations of methods. The acronyms RA and Po denote our ridge-aware method and Poisson reconstruction, respectively. L1 denotes our ℓ1 conformal mapping method with non-local constraints; L2 indicates LSCM [Brown07] and Geo indicates geodesic unwrapping [Zhang08]. Po + L2 produced gross failures and is thus not compared.
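To give some intuition for why an L1 objective can tolerate outliers better than a least-squares (L2) one, the toy sketch below fits a single constant to noisy samples. The L2-optimal constant is the mean and the L1-optimal constant is the median. This is a generic one-dimensional illustration with made-up sample values, not the conformal mapping formulation used in the paper.

```python
import statistics

def l2_estimate(samples):
    # Minimizer of sum((c - s)**2 for s in samples) is the mean.
    return statistics.mean(samples)

def l1_estimate(samples):
    # A minimizer of sum(abs(c - s) for s in samples) is the median.
    return statistics.median(samples)

# Hypothetical measurements near 1.0, with one gross outlier.
samples = [1.0, 1.1, 0.9, 1.05, 0.95, 10.0]

l2 = l2_estimate(samples)
l1 = l1_estimate(samples)
print(f"L2 (mean):   {l2:.3f}")
print(f"L1 (median): {l1:.3f}")
```

The single outlier pulls the L2 estimate far from the bulk of the samples, while the L1 estimate stays close to them.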

Visual

Our results: original images are shown in rows 1 and 3, and our rectification results are shown in rows 2 and 4.

RGlobal

Global distortion evaluation metric: Abbreviations are consistent with the text.

RLocal

Local distortion evaluation metric: Abbreviations are consistent with the text.

SGlobal

SLocal

(a) Comparison of the global distortion metric between our method (top) and those of Zhang et al. [Zhang08] and Brown et al. [Brown07] with varying point density and noise. Lower values indicate higher accuracy. (b) Frequency distribution of local distortion metrics for the associated experiments. Our method is more accurate when the input points are sparser or noisier.