Hello, I am fine-tuning a Whisper model on my dataset. During my first epoch, it consistently stalls about every 1000 files and flies through the files in between. It stalled at 5998, 6994, 7995, etc.: never at exactly 1000-file intervals, but very close. After a minute or two it continues. Is there a reason for this? I removed the num_proc argument when defining my dataset earlier, since num_proc = 4 was crashing my runs, so I’m not sure whether that has anything to do with it. The stalls are too regular to feel like a coincidence.
Any advice would be greatly appreciated!
For context, my GPU is an NVIDIA GeForce GTX 1650.