Multimodal data integration combines different data modalities to improve prediction and classification performance. In biology, multi-omics profiling has become a powerful tool for applications such as cancer patient stratification. However, integration of multi-omics data remains challenging because of missingness and inherent heterogeneity. Common remedies such as imputation and sample exclusion often rely on strong assumptions that can lead to information loss or distortion. To address these limitations, we propose MIND (Multimodal Integration with Neighbourhood-aware Distributions), which learns patient-specific embeddings from incomplete multi-omics data using a multimodal Variational Autoencoder with a data-driven prior. We inject the neighbourhood structure of the observed dataset, encoded as affinity matrices, into the prior, penalising latent configurations whose neighbourhood structure diverges from that of the data. MIND is robust to high missing rates, unbalanced missingness patterns, and low signal-to-noise ratios. Compared with existing integration methods, MIND achieves better performance on downstream tasks on both synthetic and real data.
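To make the neighbourhood-aware idea concrete, the following is a minimal NumPy sketch and not the prior used by MIND itself. It assumes Gaussian affinities, pairwise distances averaged over the features observed in both samples, and a row-wise KL divergence as the mismatch penalty. The function names `row_affinities` and `neighbourhood_penalty` and the `bandwidth` parameter are hypothetical, introduced only for illustration.

```python
import numpy as np


def row_affinities(points, bandwidth=1.0):
    """Row-normalised Gaussian affinities between samples (rows).

    Illustrative assumption: NaN marks a missing feature, and each pairwise
    distance is the mean squared difference over jointly observed features.
    Every sample must share at least one observed feature with another one.
    """
    observed = ~np.isnan(points)
    filled = np.where(observed, points, 0.0)
    shared = observed[:, None, :] & observed[None, :, :]
    diff = filled[:, None, :] - filled[None, :, :]
    counts = shared.sum(axis=2)
    sq_dist = np.where(shared, diff ** 2, 0.0).sum(axis=2)
    mean_sq = sq_dist / np.maximum(counts, 1)

    logits = -mean_sq / (2.0 * bandwidth ** 2)
    logits[counts == 0] = -np.inf      # no shared features: zero affinity
    np.fill_diagonal(logits, -np.inf)  # a sample is not its own neighbour
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)


def neighbourhood_penalty(data, latent, eps=1e-12):
    """Mean row-wise KL(P || Q) between data and latent affinities."""
    p = row_affinities(data)
    q = row_affinities(latent)
    kl = np.sum(p * (np.log(p + eps) - np.log(q + eps)))
    return float(kl / data.shape[0])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.normal(size=(20, 5))
    data[::3, 0] = np.nan              # simulate missing measurements
    latent = rng.normal(size=(20, 2))  # stand-in for encoder outputs
    print(neighbourhood_penalty(data, latent))
```

In this sketch the penalty is zero when the latent affinities reproduce the data affinities exactly and grows as the two neighbourhood structures diverge; in a VAE setting such a term would be combined with the reconstruction objective, under whatever weighting the model specifies.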