This is interesting: Talkie is a vintage LLM, trained on “historical pre-1931 English text”. “The training data for the base model is entirely out of copyright (the USA copyright cutoff date is currently January 1, 1931).”
kottke.org. home of fine hypertext products since 1998.
Comments 4
Robin Sloan:
Meg Conley:
It is also fascinating that the model knows information up to 1931, but, at least in some science topics, seems very stuck in the early 1900s. For example, it defends the luminiferous aether hypothesis & has a distrust of special relativity.
https://cdn.bsky.app/img/feed_fullsize/plain/did:plc:flxq4uyjfotciovpw3x3fxnu/bafkreifmgg2w4zuq7wosdsv7h7pf7txevlf3pp7bl2giwoukizrvhq6cqu
I’m kind of not surprised. We tend to think of scientific advances as sudden, but there are always a bunch of people unwilling to hop on board. Remember Einstein never won a Nobel Prize for either special or general relativity. The LLMs are just reflecting how people talked (as they do).
https://bsky.app/profile/emollick.bsky.social/post/3mkjeex5gn22p