The Meta Paper

2025-04-27 18:43 default

About a month ago I found a paper on ACM about how Meta uses automated LLM-based test generation to increase test code coverage for Facebook and Instagram. Work has been a bit gnarly as of late, but I finally got a break! We fixed the glitch (and are now working on removing the Band-Aid and implementing something more permanent), so work is back to normal and I was finally able to read the paper, which was actually fairly interesting.

Meta’s strategy here was not to have an LLM create tests from scratch, or even to remove their software developers or testers from the process, but rather to have their existing test suite enhanced by a couple of LLMs. The tests their LLMs generate are then run through a fitness gauntlet to ensure the tests work, improve code coverage, and are reliable. Afterwards, they get sent to the development team for approval or rejection. They report that about 25% of the generated tests make it into production code and that the approach does improve testing.
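To make the gauntlet idea concrete for myself, here is a minimal sketch of how I picture the filtering step. To be clear, this is my own guess at the shape of it and not Meta's implementation: the `Candidate` class, its fields, and the function names are all hypothetical, and I'm assuming "reliable" means the test passes on every repeated run.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A generated test plus the measurements gathered about it (hypothetical)."""
    name: str
    builds: bool                # did the test compile / import cleanly?
    run_results: list[bool]     # pass/fail outcome of each repeated run
    coverage_gain: int          # lines newly covered beyond the existing suite


def passes_gauntlet(candidate: Candidate) -> bool:
    """Apply the three filters in order: works, is reliable, adds coverage."""
    if not candidate.builds:
        return False
    # Assumption: a test that fails even once across repeated runs is flaky.
    if not candidate.run_results or not all(candidate.run_results):
        return False
    return candidate.coverage_gain > 0


def select_for_review(candidates: list[Candidate]) -> list[Candidate]:
    """Only the survivors get sent to a human for approval or rejection."""
    return [c for c in candidates if passes_gauntlet(c)]


if __name__ == "__main__":
    generated = [
        Candidate("test_a", builds=False, run_results=[], coverage_gain=0),
        Candidate("test_b", builds=True, run_results=[True] * 5, coverage_gain=12),
        Candidate("test_c", builds=True, run_results=[True, False, True], coverage_gain=8),
        Candidate("test_d", builds=True, run_results=[True] * 5, coverage_gain=0),
    ]
    for survivor in select_for_review(generated):
        print(survivor.name)
```

The part I like is that the human reviewer only ever sees tests that already build, pass consistently, and cover something new, so the approve/reject decision is about whether the test is worth keeping rather than whether it runs at all.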

The two most interesting things I got from this paper were:

1. This setup seems to create good regression tests, which are important.
2. When I found this paper a month ago, I ran it through Gemini 2.5 Pro (experimental) to get some suggestions on how to use this information in my own work. At the time I read the suggestions and considered them suspect until I had a chance to actually read the source, which I do with just about anything it gives me. Today I went back and reviewed those suggestions and they’re pretty good.

So, if I have a chance, I may see how I can improve the tests in my own projects with an admittedly very watered-down version of this method. I also need to remain ever vigilant with these tools because, even though the Gemini summary and suggestions were good, it also told me in an earlier chat that the LTE downlink was OFDM, which is, from my understanding, not entirely accurate (as I understand it, the downlink uses OFDMA, the multiple-access scheme built on OFDM).

Other Developments and Continuing Research Direction

Since my last post, Google added a feature to NotebookLM that will automatically seek out additional sources. So I may have to try that out sometime for my “AI Harms and Benefits” research. I’m a little skeptical that it’ll take me in a useful direction, but then I was also skeptical it could essentially summarize the Meta paper and give me good suggestions. So we’ll see!

As for continuing research, I really liked this look at testing. It definitely demonstrates a benefit of AI/LLM usage. I am somewhat more skeptical, however, that LLM usage in code generation will yield such positive benefits, and I may need to see what’s out there regarding “vibe coding”. Flappy Bird clones abound, but once a code base grows to the size of the Linux kernel I feel like any LLM system is going to run into issues. Anecdotally, that’s also what I’m hearing. So I may need to check that out and also perhaps expand my horizons. Earlier I saw a potential benefit in the justice system, which surprised me, so that may be another avenue to explore.

Finally, if I have time, I may also have to see what is going on with LLM training bot tar pits and deterrence systems. As an LLM user, how might that affect me? As for website or web property operators, how does the apparently unabated and overwhelming assault by the training bots affect them? If I may, as an aside, I would like to say that from this perspective it appears Google may be a better netizen than most, and I appreciate that. I don’t want to miss out on knowledge on account of a website operator asking a bot not to visit. On the other hand, we were asked politely and we should honor that. So if robots.txt says “please don’t visit”, I would expect a bot to not visit. Also, if a bot is allowed to visit (à la the latest reports from the Wikimedia Foundation), the bot should use best practices so that it doesn’t drive the website’s data transfer up by 50%. They offer a torrent for God’s sake! Of the text! Of the website! You morons!
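For what it’s worth, honoring robots.txt is not hard. Here is a minimal sketch using Python’s standard-library `urllib.robotparser`. The bot name `ExampleTrainingBot`, the example.com URLs, and the robots.txt contents are all made up for illustration, and I parse the rules from a string so the example doesn’t touch the network. A real crawler would fetch the site’s actual robots.txt first.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named bot is asked to stay away entirely,
# everyone else is only asked to skip /private/.
EXAMPLE_ROBOTS_TXT = """\
User-agent: ExampleTrainingBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/articles/1"
    for agent in ("ExampleTrainingBot", "SomeOtherBot"):
        verdict = "may fetch" if is_allowed(EXAMPLE_ROBOTS_TXT, agent, page) else "must skip"
        print(f"{agent} {verdict} {page}")
```

The check is a handful of lines, so a bot that ignores the file is making a choice, not hitting a technical limitation.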

Until then!