Here's my end-of-year review of things we learned about LLMs in 2024 - we learned a LOT of things https://simonwillison.net/2024/Dec/31/llms-in-2024/
Table of contents:
@iveyline I really hope not. I like LLMs that augment human abilities - that give us new tools. That's one of the reasons I'm unexcited about the idea of "AGI" - that sounds like a human-replacement play to me, which doesn't interest me at all.
Video scraping: extracting JSON data from a 35 second screen capture for less than 1/10th of a cent https://simonwillison.net/2024/Oct/17/video-scraping/
I needed to extract information from a dozen emails in my inbox... so I ran a screen capture tool, clicked through each of them in turn and then got Google's Gemini 1.5 Flash multi-modal LLM to extract (correct, I checked it) JSON data from that 35 second video.
Total cost for 11,018 tokens: $0.00082635
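That per-token price works out exactly if you assume Gemini 1.5 Flash's input pricing of $0.075 per million tokens (the late-2024 list price; treat the number as an assumption, not something stated in the post). A minimal sketch of the arithmetic:

```python
def gemini_cost(tokens: int, price_per_million: float = 0.075) -> float:
    """Estimate the dollar cost of a Gemini 1.5 Flash request.

    price_per_million is an assumed input-token price ($0.075/M,
    the published rate in late 2024) -- check current pricing.
    """
    return tokens * price_per_million / 1_000_000

# The 11,018-token video-scraping run:
print(f"${gemini_cost(11_018):.8f}")  # → $0.00082635
```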
It turns out Google Chrome ships a default, hidden extension that allows code on `*.google.com` access to private APIs, including your current CPU usage
You can test it out by pasting the following into your Chrome DevTools console on any Google page:
chrome.runtime.sendMessage(
  "nkeimhogjdpnpccoofpliimaahmaaome",
  { method: "cpu.getInfo" },
  (response) => {
    console.log(JSON.stringify(response, null, 2));
  },
);
More notes here: https://simonwillison.net/2024/Jul/9/hangout_servicesthunkjs/
Several of the major social media platforms - Instagram, TikTok, LinkedIn, Twitter - have effectively declared war on linking to things and I absolutely hate it
"Link in my bio" / "Link in thread" / "Link in first comment"... or increasingly no link at all, just an unsourced screenshot of a page
@feditips @Snowshadow @mhoye I think it's this bug here https://github.com/mastodon/mastodon/issues/24676
@feditips @Snowshadow @mhoye that's the bug: it's definitely a thread!
You can see that it's a thread if you follow this link to the second post in that thread: https://mastodon.social/@mhoye/111336017090790537
OK, I have a somewhat baffling (to me) Mastodon question. How do I link to a thread?
I want to link to a fantastic thread by @mhoye - but if I link to the first post in that thread - https://mastodon.social/@mhoye/111335603309582734 - I get a page with a single post on it and other people's replies, with no indication it's part of a larger thread from the same author
Am I missing something here?
Question for people who understand how US non-profits work - how normal is it to spend $247,000 on "CEO outsourced services" as an independent contractor?
Just poking around in https://projects.propublica.org/nonprofits/organizations/133444882
We accidentally invented computers that can lie to us and we can't figure out how to make them stop
(If you don't think it's possible for a computer to deliberately lie, take a look at "sycophancy" and "sandbagging" in the field of large language models! https://simonwillison.net/2023/Apr/5/sycophancy-sandbagging/ )
Here's my latest weirdly specific GPT-4 enhanced project: we wanted to measure the temperature of a microwave Raku kiln (yes, that's a thing - talk to @natbat about it) over time without tediously watching the thermometer for hours... so instead we recorded a video of the thermometer then used ffmpeg and Google Cloud Vision to OCR readings from it into a database https://til.simonwillison.net/googlecloud/video-frame-ocr
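The frame-sampling half of that pipeline is easy to sketch. This is a hypothetical reconstruction, not the project's exact code: the filenames, the one-frame-every-five-seconds rate, and the output pattern are all illustrative assumptions, with the Cloud Vision OCR step left as a comment.

```python
import subprocess

def frame_extraction_cmd(video: str, out_dir: str, fps: float = 0.2) -> list[str]:
    """Build an ffmpeg command that saves one frame every 1/fps seconds
    (one every 5 seconds by default) as numbered JPEGs."""
    return [
        "ffmpeg", "-i", video,
        "-vf", f"fps={fps}",           # ffmpeg's fps filter downsamples the frame rate
        f"{out_dir}/frame_%05d.jpg",   # frame_00001.jpg, frame_00002.jpg, ...
    ]

cmd = frame_extraction_cmd("kiln.mp4", "frames")
# subprocess.run(cmd, check=True)
# ...then send each JPEG to the Cloud Vision text-detection API
# and parse the thermometer reading out of the OCR result.
```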
I expect GPT-4 will have a LOT of applications in web scraping
The increased 32,000-token limit will be large enough to send it the full DOM of most pages, serialized to HTML - then ask questions to extract data
Or... take a screenshot and use the GPT-4 image input mode to ask questions about the visually rendered page instead!
Might need to dust off all of those old semantic web dreams, because the world's information is rapidly becoming fully machine readable
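The DOM-scraping idea above can be sketched as a prompt-assembly step. Everything here is an assumption for illustration: the instruction wording, the JSON-output request, and the crude character budget standing in for a real token count. The actual model call is left as a comment rather than invented.

```python
def build_scrape_prompt(html: str, question: str, max_chars: int = 100_000) -> list[dict]:
    """Assemble chat messages asking an LLM to extract data from raw HTML.

    max_chars is a rough stand-in for a real token budget -- a proper
    implementation would count tokens against the 32k context window.
    """
    return [
        {"role": "system",
         "content": "Answer using only the supplied HTML. Reply with JSON."},
        {"role": "user",
         "content": f"{question}\n\n{html[:max_chars]}"},
    ]

messages = build_scrape_prompt("<html>...</html>", "List every product name and price")
# Then pass `messages` to your chat-completion client of choice,
# e.g. a GPT-4 32k model, and parse the JSON it returns.
```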
@matrix I think both!
@matrix Trying that now and the results are pretty extraordinary, it seems to be able to write Datasette plugins from scratch
Just got access to GPT-4 via ChatGPT and it's doing shockingly (and creepily) well on my test of "Who is X?" - here's its answers for Simon Willison and for Andy Baio
Both of them appear to be entirely accurate - in the past I've seen all sorts of wild hallucinations from this kind of prompt
There are a LOT of screenshots of the current Bing floating around right now where it answers questions with hilariously bad answers. This is NOT the new Bing though: this is Bing's existing version of Google's "featured snippets"
The new Bing is still behind a waitlist for most people. I've attached a screenshot of that taken from this Verge article: https://www.theverge.com/2023/2/7/23587454/microsoft-bing-edge-chatgpt-ai
If you see a screenshot like this one you can dunk on it all you like but it's NOT the new GPT-3 enhanced Bing: this is something Bing has been doing poorly for a long time in its existing form
The best screenshots I've seen of the new Bing chat interface so far are in this Reddit gallery, where the bot genuinely ends up trying to passive aggressively gaslight the user into believing that it's still 2022 https://www.reddit.com/r/bing/comments/110eagl/the_customer_service_of_the_new_bing_chat_is/
(I really hope I can get access to this thing before they fix its personality to not be so weird and rude and argumentative)
Open source developer building tools to help journalists, archivists, librarians and others analyze, explore and publish their data. https://datasette.io and many other #projects.