Testing the capabilities of AI scientists in real-world scenarios