I've been summarizing this sort of thing as "having taste". If you're still doing this sort of research, maybe find a bunch of bad vibe coders and give them this task again. Then give them reading material (in the form of hours of YouTube playlists) from different programming paradigms, randomize the order in which they are given each playlist, tell the vibe coders to implement the "programming taste" of the paradigm they last watched, and see what happens to these benchmarks.
this is happening btw. (slightly different form tho) are you interested?
It's unclear if this is you giving me a link, asking me to try my hand at designing a playlist for vibe coders to learn from, or asking me to be a vibe coder for it. What exactly is happening?
my bad for vibe posting. all three actually
I'm designing an experiment where participants solve a task with AI. half get a cheatsheet with common footguns, half don't.
your playlist idea maps to what I call "multiphase" (didn't put lots of effort into naming it yet tbh), where phase 1 acquires the vocabulary for domain concerns with LLMs and phase 2 uses it in directed prompts. I have enough data for phase 2 already (a 100% success rate across 3 model families when the right words are in the prompt). what's still missing is phase 1
lmk if you're interested in any of these
I wouldn't mind trying my hand at writing a cheat sheet or trying to teach vibe coders, but I know nothing about databases.
Shouldn't you hedge your bets on cheatsheets by having maybe 3 different versions?
it's not about dbs. I haven't decided on the task yet. currently mining for ones that have typical production/scale footguns which LLMs often miss but which get resolved via proper vocabulary each time.
that would be the very next step! This one was just what I could do in a relatively short amount of time with the resources I had.
Depending on how this one is received, I'd seriously consider it.