Image processing requires more computational power and data throughput than most conventional processors can provide. Dedicated hardware can reduce execution time and achieve better performance per unit of silicon area. This paper presents a configurable, field-programmable gate array (FPGA)-based systolic architecture tailored for real-time window-based image operations. The architecture is built around a 2D systolic array of 7 × 7 configurable window processors. It was implemented on an FPGA to execute algorithms with window sizes up to 7 × 7, and the design can be scaled to larger window sizes if required. The architecture reaches a throughput of 3.16 GOPs at a 60 MHz clock frequency and a processing time of 8.35 milliseconds for generic 7 × 7 window-based operators on 512 × 512 gray-level images. It compares favorably with other architectures in terms of performance and hardware utilization. Theoretical and experimental results are presented to demonstrate the architecture's effectiveness.
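To make the term "generic window-based operator" concrete, the following is a minimal software sketch of the computation such an operator performs, not a description of the hardware itself. It assumes the operator can be split into a per-pixel local step and a reduction over the window; the names `window_operator`, `local_op`, and `reduce_op` are hypothetical and do not come from the paper, and only the "valid" region of the image is processed (no border handling).

```python
from functools import reduce
import operator


def window_operator(image, mask, local_op, reduce_op):
    """Apply a generic window-based operator over the valid region of an image.

    image:     list of rows of gray-level values
    mask:      k x k list of rows of coefficients (k = 7 in the example below)
    local_op:  combines one pixel with its mask coefficient
    reduce_op: folds the k*k local results into one output pixel
    """
    rows, cols = len(image), len(image[0])
    k = len(mask)
    out = []
    for r in range(rows - k + 1):
        out_row = []
        for c in range(cols - k + 1):
            # Local step: one result per window position.
            local = [
                local_op(image[r + i][c + j], mask[i][j])
                for i in range(k)
                for j in range(k)
            ]
            # Reduction step: collapse the window into a single value.
            out_row.append(reduce(reduce_op, local))
        out.append(out_row)
    return out


# Illustrative 9 x 9 test image (assumed data, not from the paper).
image = [[r * 9 + c for c in range(9)] for r in range(9)]

# Multiply-and-add with an all-ones 7 x 7 mask: a box-filter style sum.
ones = [[1] * 7 for _ in range(7)]
sums = window_operator(image, ones, operator.mul, operator.add)

# Add-and-max with an all-zeros 7 x 7 mask: gray-level dilation with a flat mask.
zeros = [[0] * 7 for _ in range(7)]
maxima = window_operator(image, zeros, operator.add, max)
```

Swapping the two function arguments changes the operator without changing the traversal, which is the sense in which a single window-processing structure can be configured for different algorithms.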