Clustering swap prediction for image-text pre-training
It is essential to investigate pre-training strategies for multimodal models, as they have a clear impact on downstream tasks. Recently, clustering-based learning has achieved noteworthy benefits across multiple methods. However, due to the limited availability of open image-text pairs, it is challenging for multimoda...
Main Authors: Fayou, Sun; Meng, Zuqiang; Ngo, Hea Choon; Sek, Yong Wee
Format: Article
Language: English
Published: Nature Research, 2024
Online Access:
http://eprints.utem.edu.my/id/eprint/27536/2/0130221062024105857.PDF
http://eprints.utem.edu.my/id/eprint/27536/
https://www.nature.com/articles/s41598-024-60832-x#:~:text=We%20argue%20that%20the%20advantages,can%20be%20dynamically%20adjusted%20with
https://doi.org/10.1038/s41598-024-60832-x
Similar Items
- Associating multiple vision transformer layers for fine-grained image representation
  Author: Sun, Fayou, et al.
  Published: (2023)
- Loop and distillation: Attention weights fusion transformer for fine-grained representation
  Author: Sun, Fayou, et al.
  Published: (2023)
- Adopting multiple vision transformer layers for fine-grained image representation
  Author: Sun, Fayou, et al.
  Published: (2023)
- Adopting attention and cross-layer features for fine-grained representation
  Author: Sun, Fayou, et al.
  Published: (2022)
- Snake-swapping for better insight
  Author: Times Two, Malaysia
  Published: (1988)