Face-voice association towards multimodal-based authentication using modulated spike-time dependent learning