Abstract: We present a thin camera which encodes 3D scenes into the 2D image sensor through a single piece of thin custom-fabricated microlens array and reconstructs the 3D scenes through a deep learning framework. The microlens array is designed to have a balanced frequency support among different spatial frequency. The deep learning framework is assisted with an adversarial learning model, and has a high speed in reconstruction. We validate the system in both simulations and experiments. Our thin 3D camera demonstrates the great potential of combining custom-designed micro-optics and deep learning algorithms in computational imaging.