We've added support for vision fine-tuning

I want to know whether one training sample can contain more than 10 images. From the docs, I cannot work out the limit on the number of images in one training sample (one chat record). Looking forward to your reply, thanks very much.
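
To make the question concrete, here is a rough sketch of how the images in each sample could be counted. It assumes the chat-style JSONL training format, where each line is one sample with a `messages` list and images appear as content parts of type `image_url`. The helper names `count_images` and `samples_over_limit` and the `max_images` parameter are made up for illustration; the value 10 used below is only the number from my question, not a confirmed limit.

```python
import json


def count_images(sample: dict) -> int:
    """Count image_url content parts across all messages in one sample."""
    total = 0
    for message in sample.get("messages", []):
        content = message.get("content")
        # Text-only messages use a plain string; multimodal ones use a list of parts.
        if isinstance(content, list):
            total += sum(
                1
                for part in content
                if isinstance(part, dict) and part.get("type") == "image_url"
            )
    return total


def samples_over_limit(jsonl_text: str, max_images: int) -> list:
    """Return (line_number, image_count) for samples exceeding max_images.

    max_images is a hypothetical threshold supplied by the caller.
    """
    over = []
    for line_number, line in enumerate(jsonl_text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        count = count_images(json.loads(line))
        if count > max_images:
            over.append((line_number, count))
    return over
```

Running `samples_over_limit(open("train.jsonl").read(), max_images=10)` on a training file (the file name is a placeholder) would list the samples I am asking about.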