Exploring Code Quality, LLM Trends, and Metrics in the Tech Sphere
Guess what I've been up to this past week? Diving deep into the fascinating world of awesome code! 🤓 It's like exploring hidden treasure chests of programming wisdom. Speaking of which, I stumbled upon two absolute gems that I've gotta share with you. First off, there's this rad piece by the coding guru herself, @techgirl1908 (Angie). Her insights go beyond code reviews and resonate with the vibes of "The Zen of Python." And for those who've struggled to find killer code samples, don't fret! Hackernoon's got a thread that's basically a goldmine of coding excellence. 💎
Now, hold on tight because the wild world of Large Language Models (LLMs) is cooking up some crazy cool stuff. If you're up to date on Twitter or hanging with the trendsetters, you'll bump into buzzwords like hallucinations and in-context learning. But wait, there's more! Keep your eyes peeled for epic developments in areas like multimodality and fresh architectures that go beyond Transformers. Shoutout to Chip Huyen for breaking down all the exciting frontiers in the LLM realm here. Oh, and if you're into spotting patterns in LLM use, Eugene Yan's got you covered with his intriguing piece. 🤖
Okay, let's talk real talk. If you're riding the OpenAI wave, you've probably noticed that sometimes things get a bit wonky in the output department. But don't worry, it's all part of the experimentation ride. Pinning the blame on one or two things would be like blaming a single cloud for the whole rainstorm. One of the major culprits here? AI drift! For instance, GPT-4's accuracy on the USMLE went from a "You got this!" 86.6% in March to a "Maybe get a tutor?" 82.4% by June. Want the deets? There's a nifty paper tracking these shifts over time. Or if you're more into the easy breezy, @JtheAIwhisperer's got a sweet breakdown of AI drift and its funky manifestations. 🌊
Hold onto your keyboards because I've also been knee-deep in self-service metrics lately. Some metrics are like a warm hug, while others... not so much. I've personally developed a strong dislike for the signup metric. It's like that one friend who's always overly dramatic. Andrew Chen's weighed in on metrics, too (link alert!). Oh, and I stumbled upon a riveting convo between Kyle Poyar from OpenView and Greg Kogan, Pinecone's VP of Marketing. Pinecone, the database dream team, just scored a massive investment and their product's in such high demand that signups are rolling in at a ludicrous pace (even though they think signups aren't the real MVPs of growth!). You've gotta check out Greg's chat—it's like a playbook for growth magic! 🪄
As we wade through these tech frontiers, it's clear that the landscape is ever-changing, offering us endless chances to learn, grow, and maybe even create some coding magic ourselves. Stay curious, stay nerdy, and keep coding those dreams into reality! 🌟
If you have some more time, here are some interesting links from the past week’s readings:
Mean Reversion: When Will Startup Investing Return to Normal?
Beating GPT-4 on HumanEval with a Fine-Tuned CodeLlama-34B. Phind fine-tuned CodeLlama-34B and CodeLlama-34B-Python on an internal Phind dataset, producing models that achieved 67.6% and 69.5% pass@1 on HumanEval, respectively. GPT-4 achieved 67% according to OpenAI's official technical report in March.
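Quick aside for anyone new to the pass@1 metric: it estimates the chance that at least one of k sampled completions passes a problem's unit tests. Here's a minimal sketch of the standard unbiased pass@k estimator popularized by the original HumanEval paper (the numbers in the example are made up for illustration, not Phind's or OpenAI's):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: given n sampled completions for a
    problem, of which c pass the unit tests, returns the probability
    that at least one of k random samples passes:
        pass@k = 1 - C(n - c, k) / C(n, k)
    """
    if n - c < k:
        return 1.0  # too few failures to fill a k-sample draw
    return 1.0 - comb(n - c, k) / comb(n, k)

# Hypothetical problem: 10 samples drawn, 7 pass the tests.
print(pass_at_k(10, 7, 1))   # pass@1 = 0.7 for this problem
print(pass_at_k(10, 7, 10))  # pass@10 = 1.0 (some sample must pass)
```

The benchmark score is then this quantity averaged over all problems; pass@1 with one sample per problem reduces to the plain fraction of problems solved.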
When tech says no: The tech industry always has a reason why any new laws or regulations are bad; indeed, so does any industry. They always say that! The trouble is, sometimes it's true, and some laws are (or would be) disasters. So which is it? Well, there are three ways that people say "NO!"
Nat Friedman’s 10 bullets on why the efficient market hypothesis is a lie and why the cultural prohibition on micromanagement is harmful
There is now strong evidence that AI can help make us more innovative.
From Spotify: Encouragement Designs and Instrumental Variables for A/B Testing. The core idea of an encouragement design is to assign the treatment to the entire population that we’re testing on, but randomize the encouragement to use the feature.
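To make the encouragement-design idea concrete, here's a toy simulation (all data synthetic, not Spotify's) using the classic Wald instrumental-variable estimator: everyone has feature access, the randomized nudge is the instrument, and we recover the effect of actual usage on the outcome even though usage itself is confounded:

```python
import random

random.seed(0)

# Encouragement design: feature access for all; only the *nudge* Z is
# randomized. Z shifts usage D, and we estimate the effect of D on
# outcome Y with the Wald estimator:
#   effect = (E[Y|Z=1] - E[Y|Z=0]) / (E[D|Z=1] - E[D|Z=0])

def simulate(n=100_000, true_effect=2.0):
    rows = []
    for _ in range(n):
        z = random.random() < 0.5              # randomized encouragement
        base = random.gauss(0, 1)              # latent inclination (confounder)
        d = (base + (1.0 if z else 0.0)) > 0.5 # feature usage, nudged by z
        y = true_effect * d + base + random.gauss(0, 1)
        rows.append((z, d, y))
    return rows

def wald(rows):
    mean = lambda xs: sum(xs) / len(xs)
    y1 = [y for z, d, y in rows if z]
    y0 = [y for z, d, y in rows if not z]
    d1 = [float(d) for z, d, y in rows if z]
    d0 = [float(d) for z, d, y in rows if not z]
    return (mean(y1) - mean(y0)) / (mean(d1) - mean(d0))

# Naive comparison of users vs. non-users is biased by `base`;
# the IV estimate lands close to the true effect of 2.0.
print(round(wald(simulate()), 2))
```

The estimator only uses the randomized nudge, so self-selection into usage (the `base` confounder here) doesn't bias it; the catch, as the IV literature notes, is that it identifies the effect for "compliers", the users whose behavior the encouragement actually changes.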
Long read from Martin Fowler: surging cloud and managed-services costs are outpacing customer growth
Discourses on software development and the role of engineering
I think software engineering will spawn a new subdiscipline, specializing in applications of AI and wielding the emerging stack effectively, just as “site reliability engineer”, “devops engineer”, “data engineer”, and “analytics engineer” emerged. The emerging (and least cringe) version of this role seems to be: AI Engineer.
We have come a long way on Text2Voice. PlayHT2.0 is a leap in the field of speech synthesis, based on an advanced neural network model akin to the transformer-based methods OpenAI uses in models like DALL·E 2, yet uniquely catered to the realm of audio. Truly a SOTA Text2Voice for conversational speech.