This paper proposes a two-level processing scheme for three-dimensional image sensing. The first-level processing selects only the spatial regions needed for a smart monitoring task, reducing the total volume of data traffic. The second-level processing integrates multiple physical image sensors into a single virtual sensor to improve delay and jitter performance in the real-time transmission of data from the sensors to the cloud server. We develop a prototype system that implements the proposed scheme. Our demonstration shows that the proposed scheme outperforms benchmark schemes that do not adopt the two-level processing.