We found that 50% of LAION-2B samples contain Parrot Captions (text that appears both in the caption and printed in the image pixels), and that Parrot Captions have a large impact on CLIP-style vision-language alignment... I'm a researcher at Shanghai AI Lab working on data-centric research. Before that, I obtained my bachelor's and master's degrees at Sun Yat-Sen University (SYSU).
Interest Score
1
HIT Score
0.00
Domain
linyq17.github.io

Actual
linyq17.github.io

IP
185.199.108.153, 185.199.109.153, 185.199.110.153, 185.199.111.153

Status
OK

Category
Company