
Everyone says you need huge data sets for good AI, but my small project worked fine with just 500 examples.

I was building a tool to sort support tickets and kept hitting a wall. I tried a trick from an old paper, feeding the model the same data three different ways. Accuracy jumped from 70% to 88% in a week. Has anyone else gotten results with a small, smart data set?
4 comments

hernandez.stella
Yeah, I had that happen with a text classifier. I got way better results by writing ten different versions of each training example myself, like changing a few words or the sentence order. It made the model way less picky about how things were phrased.
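For anyone who wants to try this without hand-writing every variant, here is a rough sketch of the same idea: swap a few words and reorder the sentences to get several phrasings of one labeled example. The synonym table, the function names, and the ticket text are all made up for illustration, and the hand-written versions described above will usually read more naturally than anything this produces.

```python
import random

# Hypothetical synonym table, invented for this sketch.
SYNONYMS = {
    "broken": ["failing", "crashing"],
    "help": ["assistance", "support"],
    "refund": ["reimbursement", "repayment"],
}

def swap_words(sentence, rng):
    """Swap each known word for a random synonym about half the time."""
    out = []
    for word in sentence.split():
        options = SYNONYMS.get(word.lower())
        if options and rng.random() < 0.5:
            out.append(rng.choice(options))
        else:
            out.append(word)
    return " ".join(out)

def make_variant(text, rng):
    """Build one variant: swap words in each sentence, then shuffle sentence order."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    sentences = [swap_words(s, rng) for s in sentences]
    rng.shuffle(sentences)
    return ". ".join(sentences) + "."

def augment(text, label, n_variants=10, seed=0):
    """Return the original example plus up to n_variants distinct rephrasings."""
    rng = random.Random(seed)
    seen = {text}
    examples = [(text, label)]
    attempts = 0
    # Cap the attempts so short texts with few possible variants cannot loop forever.
    while len(examples) < n_variants + 1 and attempts < n_variants * 20:
        candidate = make_variant(text, rng)
        if candidate not in seen:
            seen.add(candidate)
            examples.append((candidate, label))
        attempts += 1
    return examples

if __name__ == "__main__":
    for variant, label in augment("My app is broken. I need help with a refund.", "billing"):
        print(label, "|", variant)
```

Every variant keeps the original label, so the classifier sees the same category phrased several ways. Naive sentence shuffling can scramble meaning in longer texts, so it is worth reading through the output before training on it.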
6
aaronm55 · 1mo ago
Did you find it helped with sarcasm too?
7
andrewwilliams
Smart data beats big data sometimes.
1
craig.viola
That's a solid point about making your own training examples. It makes me wonder about the data itself, though. Where does it even come from? A lot of those big datasets are just scraped from the web, full of junk and weird biases. Cleaning that up by hand for a smaller, smarter set seems like the real win.
4