The last stage of any automatic surveillance system is the interpretation of the information acquired from its sensors. This work focuses on the interpretation of motion pictures taken by a surveillance camera, i.e., image understanding. A prototype fuzzy expert system is presented that can describe, in a natural-language-like manner, simple human activity in the field of view of a surveillance camera. The system comprises three components: a pre-processing module for image segmentation and feature extraction, an object identification fuzzy expert system (static model), and an action identification fuzzy expert system (dynamic temporal model). The system was tested on a video segment of a pedestrian passageway taken by a surveillance camera.
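As a rough illustration of how the static and dynamic models might fit together, the following minimal sketch applies fuzzy rules to features that a pre-processing stage could supply. It is not the system described here: the `Blob` features, the trapezoidal membership functions, every threshold, the frame rate, and the action labels are hypothetical values chosen only to make the example concrete.

```python
from dataclasses import dataclass


def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 0 outside (a, d), 1 on [b, c], linear between."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


@dataclass
class Blob:
    """Hypothetical features extracted for one segmented image region."""
    cx: float      # centroid x, pixels
    cy: float      # centroid y, pixels
    width: float   # bounding-box width, pixels
    height: float  # bounding-box height, pixels


def person_degree(blob):
    """Static model (assumed rule): IF aspect is tall AND height is medium THEN person.

    The fuzzy AND is taken as the minimum of the two membership degrees.
    """
    aspect = blob.height / blob.width
    tall = trapezoid(aspect, 1.5, 2.0, 4.0, 5.0)          # assumed thresholds
    medium = trapezoid(blob.height, 40.0, 60.0, 120.0, 160.0)  # assumed thresholds
    return min(tall, medium)


def describe_motion(prev, curr, fps=25.0):
    """Dynamic model (assumed): label the action from centroid speed between frames."""
    dx, dy = curr.cx - prev.cx, curr.cy - prev.cy
    speed = (dx * dx + dy * dy) ** 0.5 * fps  # pixels per second
    degrees = {
        "standing": trapezoid(speed, -1.0, 0.0, 5.0, 15.0),
        "walking": trapezoid(speed, 5.0, 15.0, 60.0, 90.0),
        "running": trapezoid(speed, 60.0, 90.0, 1e9, 2e9),
    }
    action = max(degrees, key=degrees.get)  # label with the highest degree
    return action, degrees[action]


def describe(prev, curr):
    """Combine both models into a short natural-language-like sentence."""
    action, degree = describe_motion(prev, curr)
    return f"A person (degree {person_degree(curr):.2f}) is {action} (degree {degree:.2f})."
```

For example, calling `describe(Blob(100, 200, 30, 90), Blob(101, 200, 30, 90))` on two consecutive frames yields a sentence reporting a person who is walking. Taking the minimum for the fuzzy AND and choosing the highest-degree label are common simple choices, used here only for brevity.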