iLAVSE is a deep-learning-based audio-visual speech enhancement (AVSE) project that addresses three practical issues often encountered when implementing AVSE systems: the requirement for additional visual data, audio-visual asynchronization, and low-quality visual data.