
kottke.org posts about artificial intelligence

Ted Chiang: Fears of Technology Are Fears of Capitalism

posted by Jason Kottke   Apr 02, 2021

Writer Ted Chiang (author of the fantastic Exhalation) was recently a guest on the Ezra Klein Show. The conversation ranged widely — I enjoyed his thoughts on superheroes — but his comments on capitalism and technology seem particularly relevant right now. From the transcript:

I tend to think that most fears about A.I. are best understood as fears about capitalism. And I think that this is actually true of most fears of technology, too. Most of our fears or anxieties about technology are best understood as fears or anxiety about how capitalism will use technology against us. And technology and capitalism have been so closely intertwined that it’s hard to distinguish the two.

Let’s think about it this way. How much would we fear any technology, whether A.I. or some other technology, how much would you fear it if we lived in a world that was a lot like Denmark or if the entire world was run sort of on the principles of one of the Scandinavian countries? There’s universal health care. Everyone has child care, free college maybe. And maybe there’s some version of universal basic income there.

Now if the entire world operates according to — is run on those principles, how much do you worry about a new technology then? I think much, much less than we do now. Most of the things that we worry about under the mode of capitalism that the U.S practices, that is going to put people out of work, that is going to make people’s lives harder, because corporations will see it as a way to increase their profits and reduce their costs. It’s not intrinsic to that technology. It’s not that technology fundamentally is about putting people out of work.

It’s capitalism that wants to reduce costs and reduce costs by laying people off. It’s not that like all technology suddenly becomes benign in this world. But it’s like, in a world where we have really strong social safety nets, then you could maybe actually evaluate sort of the pros and cons of technology as a technology, as opposed to seeing it through how capitalism is going to use it against us. How are giant corporations going to use this to increase their profits at our expense?

And so, I feel like that is kind of the unexamined assumption in a lot of discussions about the inevitability of technological change and technologically-induced unemployment. Those are fundamentally about capitalism and the fact that we are sort of unable to question capitalism. We take it as an assumption that it will always exist and that we will never escape it. And that’s sort of the background radiation that we are all having to live with. But yeah, I’d like us to be able to separate an evaluation of the merits and drawbacks of technology from the framework of capitalism.

Echoing some of his other thoughts during the podcast, Chiang also wrote a piece for the New Yorker the other day about how the singularity will probably never come.

How Do Algorithms Become Biased?

posted by Jason Kottke   Mar 31, 2021

In the latest episode of the Vox series Glad You Asked, host Joss Fong looks at how racial and other kinds of bias are introduced into massive computer systems and algorithms, particularly those that work through machine learning, that we use every day.

Many of us assume that tech is neutral, and we have turned to tech as a way to root out racism, sexism, or other “isms” plaguing human decision-making. But as data-driven systems become a bigger and bigger part of our lives, we also notice more and more when they fail, and, more importantly, that they don’t fail on everyone equally. Glad You Asked host Joss Fong wants to know: Why do we think tech is neutral? How do algorithms become biased? And how can we fix these algorithms before they cause harm?
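To make the mechanism concrete, here's a minimal sketch (a toy dataset invented for illustration, not anything from the video) of how a model trained on data that under-represents one group can end up with unequal error rates across groups, even though group membership never appears in the code:

```python
# Toy illustration (not from the video): a model trained on data that
# under-represents one group can show unequal error rates across groups.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_group(n, shift):
    # One informative feature; the "true" signal sits at a different
    # threshold per group, standing in for sampling or measurement differences.
    x = rng.normal(loc=shift, scale=1.0, size=(n, 1))
    y = (x[:, 0] + rng.normal(scale=0.5, size=n) > shift).astype(int)
    return x, y

# Group A dominates the training set; group B is under-represented.
xa, ya = make_group(5000, shift=0.0)
xb, yb = make_group(250, shift=1.0)
X = np.vstack([xa, xb])
y = np.concatenate([ya, yb])

model = LogisticRegression().fit(X, y)

# Evaluate on fresh samples from each group.
for name, shift in [("group A", 0.0), ("group B", 1.0)]:
    xt, yt = make_group(2000, shift)
    pred = model.predict(xt)
    acc = (pred == yt).mean()
    fpr = ((pred == 1) & (yt == 0)).sum() / max((yt == 0).sum(), 1)
    print(f"{name}: accuracy = {acc:.2%}, false positive rate = {fpr:.2%}")
```

The model learns a decision boundary that fits the majority group; the under-represented group, whose data looks slightly different, absorbs most of the mistakes.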

Full? Self? Driving? Hmmm…

posted by Jason Kottke   Mar 18, 2021

How is Tesla's Full Self-Driving system coming along? Perhaps not so well. YouTuber AI Addict took the company's FSD Beta 8.2 for a drive through downtown Oakland recently and encountered all sorts of difficulties. The video's chapter names should give you some idea: Crosses Solid Lines, Acting Drunk, Right Turn In Wrong Lane, Wrong Way!!!, Near Collision (1), and Near Collision (2). They've also done videos of drives in SF and San Jose.

I realize this is a beta, but it's a beta being tested by consumers on actual public roads. I'm sure it works great on their immaculate test track, but when a 30-minute drive produces several moments where irregularities in your beta could easily result in the death or grave injury of a pedestrian, cyclist, or other motorist, how can you consider it safe to release to the public in any way? It seems like Level 5 autonomy is going to be difficult to manage under certain road conditions. (via @TaylorOgan)

BirdCast: Real-Time Bird Migration Forecasts

posted by Jason Kottke   Mar 09, 2021

Birdcast

Colorado State University and the Cornell Lab of Ornithology have developed a system called BirdCast that uses machine learning & two decades of historical bird movement data to develop daily bird migration forecasts for the United States.

Bird migration forecasts show predicted nocturnal migration 3 hours after local sunset and are updated every 6 hours. These forecasts come from models trained on the last 23 years of bird movements in the atmosphere as detected by the US NEXRAD weather surveillance radar network. In these models we use the Global Forecasting System (GFS) to predict suitable conditions for migration occurring three hours after local sunset.
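In rough outline, the approach is: learn a mapping from nightly weather conditions to the migration intensity the radar network observed, then feed it forecast weather. The sketch below uses invented features, data, and a generic model as stand-ins, not BirdCast's actual system.

```python
# Rough sketch of the BirdCast idea (invented data, not their model):
# learn a mapping from nightly weather conditions to the migration
# intensity that radar observed, then apply it to forecast weather.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)
n_nights = 5000

# Hypothetical per-night features: temperature (C), southerly tailwind (m/s),
# and cloud cover fraction, standing in for GFS forecast variables.
temp = rng.normal(10, 8, n_nights)
tailwind = rng.normal(0, 5, n_nights)
cloud = rng.uniform(0, 1, n_nights)
X = np.column_stack([temp, tailwind, cloud])

# Invented "radar-observed" migration intensity: birds prefer warm nights
# with tailwinds and clear skies.
y = np.clip(0.05 * temp + 0.08 * tailwind - 0.5 * cloud +
            rng.normal(0, 0.2, n_nights), 0, None)

model = GradientBoostingRegressor().fit(X, y)

# "Tonight's" forecast: warm, light tailwind, mostly clear.
tonight = np.array([[18.0, 3.0, 0.2]])
print(f"predicted migration intensity: {model.predict(tonight)[0]:.2f}")
```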

The map above is the migration forecast for tonight — overall, warmer temperatures and increased bird movement are predicted for the next week or two. They also maintain up-to-the-hour records of migration activity detected by the US weather surveillance radar network; this was the activity early this morning at 3:10am ET:

Birdcast

If the current & predicted bird radar maps were a part of the weather report on the local news, I might start watching again.

GANksy - an AI Street Artist that Emulates Banksy

posted by Jason Kottke   Jan 05, 2021

street art made by an AI

GANksy is an AI program trained on Banksy’s street art.

GANksy was born into the cloud in September 2020, then underwent a strenuous A.I. training regime using hundreds of street art photos for thousands of iterations to become the fully-formed artist we see today. All of GANksy’s works are original creations derived from its understanding of shape, form and texture. GANksy wants to be put into a robot body so it can spraypaint the entire planet.

The results are cool but not super coherent — these look more like abstract NIN and Radiohead album covers than the sly & whimsical works Banksy stencils up around the world. With GANksy, you get the feel of Banksy's art and the surfaces he chooses to put it on but little of the meaning, which is about what you'd expect from training a neural network on style alone.
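For the curious, the basic recipe behind a project like this is a generative adversarial network: a generator makes images from noise while a discriminator learns to tell them from real photos, and each improves against the other. Here's a compressed PyTorch sketch of that loop, with toy networks and random tensors standing in for the street art photos (not GANksy's actual code):

```python
# Compressed sketch of a GAN training loop (toy networks, random tensors
# standing in for real street-art photos; not GANksy's actual code).
import torch
import torch.nn as nn

z_dim, img_dim = 64, 32 * 32 * 3

# Generator: noise vector -> flattened fake image.
G = nn.Sequential(nn.Linear(z_dim, 256), nn.ReLU(),
                  nn.Linear(256, img_dim), nn.Tanh())
# Discriminator: flattened image -> real/fake logit.
D = nn.Sequential(nn.Linear(img_dim, 256), nn.LeakyReLU(0.2),
                  nn.Linear(256, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

for step in range(200):
    # In practice `real` would be a batch of street-art photos scaled to
    # [-1, 1]; random tensors keep this sketch self-contained.
    real = torch.rand(16, img_dim) * 2 - 1
    fake = G(torch.randn(16, z_dim))

    # Train the discriminator: real -> 1, generated -> 0.
    d_loss = (bce(D(real), torch.ones(16, 1)) +
              bce(D(fake.detach()), torch.zeros(16, 1)))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Train the generator: try to make the discriminator output 1.
    g_loss = bce(D(fake), torch.ones(16, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()

print(f"final losses: D={d_loss.item():.3f}, G={g_loss.item():.3f}")
```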

AlphaGo - The Movie

posted by Jason Kottke   Nov 10, 2020

I missed this back in March (I think there was a lot going on back then?) but the feature-length documentary AlphaGo is now available to stream for free on YouTube. The movie documents the development by DeepMind/Google of the AlphaGo computer program designed to play Go and the competition between AlphaGo and Lee Sedol, a Go master.

With more board configurations than there are atoms in the universe, the ancient Chinese game of Go has long been considered a grand challenge for artificial intelligence. On March 9, 2016, the worlds of Go and artificial intelligence collided in South Korea for an extraordinary best-of-five-game competition, coined The DeepMind Challenge Match. Hundreds of millions of people around the world watched as a legendary Go master took on an unproven AI challenger for the first time in history.

During the competition back in 2016, I wrote a post that rounded up some of the commentary about the matches.

Move after move was exchanged and it became apparent that Lee wasn’t gaining enough profit from his attack.

By move 32, it was unclear who was attacking whom, and by 48 Lee was desperately fending off White’s powerful counter-attack.

I can only speak for myself here, but as I watched the game unfold and the realization of what was happening dawned on me, I felt physically unwell.

The AI Who Mistook a Bald Head for a Soccer Ball

posted by Jason Kottke   Nov 02, 2020

Second-tier Scottish football club Inverness Caledonian Thistle doesn’t have a camera operator for matches at their stadium so the club uses an AI-controlled camera that’s programmed to follow the ball for their broadcasts. But in a recent match against Ayr United, the AI controller kept moving the camera off the ball to focus on the bald head of the linesman, making the match all but unwatchable. No fans allowed in the stadium either, so the broadcast was the only way to watch.

“Reverse Toonification” of Pixar Characters

posted by Jason Kottke   Oct 21, 2020

Using an AI-based framework called Pixel2Style2Pixel and searching for faces in a dataset harvested from Flickr, Nathan Shipley made some more photorealistic faces for Pixar characters.

reverse toonification of Pixar characters

In response to a reader suggestion, Shipley fed the generated image for Dash back into the system and this happened:

reverse toonification of Pixar characters

I cannot tell where these images should live in the uncanny valley. You can see some similar experiments from Shipley here: a more realistic version of Miles from Spider-Verse, images of Frida Kahlo and Diego Rivera “reverse engineered” from paintings, and an image generated from a Rembrandt self-portrait.

A.I. Claudius

posted by Jason Kottke   Aug 17, 2020

Roman Emperors Photos

For his Roman Emperor Project, Daniel Voshart (whose day job includes making VR sets for Star Trek: Discovery) used a neural-net tool and images of 800 sculptures to create photorealistic portraits of every Roman emperor from 27 BCE to 285 CE. From the introduction to the project:

Artistic interpretations are, by their nature, more art than science but I’ve made an effort to cross-reference their appearance (hair, eyes, ethnicity etc.) to historical texts and coinage. I’ve striven to age them according to the year of death — their appearance prior to any major illness.

My goal was not to romanticize emperors or make them seem heroic. In choosing bust / sculptures, my approach was to favor the bust that was made when the emperor was alive. Otherwise, I favored the bust made with the greatest craftsmanship and where the emperor was stereotypically uglier — my pet theory being that artists were likely trying to flatter their subjects.

Some emperors (latter dynasties, short reigns) did not have surviving busts. For this, I researched multiple coin depictions, family tree and birthplaces. Sometimes I created my own composites.

You can buy a print featuring the likenesses of all 54 emperors on Etsy.

See also Hand-Sculpted Archaeological Reconstructions of Ancient Faces and The Myth of Whiteness in Classical Sculpture.

Audio Deepfakes Result in Some Pretty Convincing Mashup Performances

posted by Jason Kottke   Apr 30, 2020

Have you ever wanted to hear Jay Z rap the “To Be, Or Not To Be” soliloquy from Hamlet? You are in luck:

What about Bob Dylan singing Britney Spears’ “…Baby One More Time”? Here you go:

Bill Clinton reciting “Baby Got Back” by Sir Mix-A-Lot? Yep:

And I know you've always wanted to hear six US Presidents rap NWA's “Fuck Tha Police”. Voila:

This version with the backing track is even better. These audio deepfakes were created using AI:

The voices in this video were entirely computer-generated using a text-to-speech model trained on the speech patterns of Barack Obama, Ronald Reagan, John F. Kennedy, Franklin Roosevelt, Bill Clinton, and Donald Trump.

The program listens to a bunch of speech spoken by someone and then, in theory, you can provide any text you want and the virtual Obama or Jay Z can speak it. Some of these are more convincing than others — with a bit of manual tinkering, I bet you could clean these up enough to make them convincing.
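In rough outline, the pipeline has two stages: a Tacotron-style model turns text into a mel spectrogram in the target speaker's voice, and a separate vocoder turns that spectrogram into audio. Here's a conceptual sketch of inference with placeholder classes and file names (hypothetical stand-ins, not the Vocal Synthesis code or the real Tacotron 2 API):

```python
# Conceptual sketch of the two-stage text-to-speech pipeline described
# above. The classes and checkpoint paths are hypothetical placeholders,
# not the actual Tacotron 2 / Vocal Synthesis code.
import numpy as np

class SpectrogramModel:
    """Stands in for a Tacotron 2-style model fine-tuned on one speaker."""
    def __init__(self, checkpoint_path):
        self.checkpoint_path = checkpoint_path  # hypothetical file

    def text_to_mel(self, text):
        # Real model: encode characters, attend, and decode a mel
        # spectrogram frame by frame in the speaker's cadence and timbre.
        n_frames = 20 * len(text.split())
        return np.zeros((80, n_frames))  # 80 mel bins is a common choice

class Vocoder:
    """Stands in for a neural vocoder that renders audio from a spectrogram."""
    def mel_to_audio(self, mel, sample_rate=22050):
        # Real vocoder: invert the spectrogram into a waveform.
        return np.zeros(mel.shape[1] * 256), sample_rate

tts = SpectrogramModel("speaker_checkpoint.pt")   # hypothetical path
vocoder = Vocoder()

mel = tts.text_to_mel("To be, or not to be, that is the question")
audio, sr = vocoder.mel_to_audio(mel)
print(f"generated {len(audio) / sr:.1f} seconds of audio")
```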

Two of the videos featuring Jay Z’s synthesized voice were forced offline by a copyright claim from his record company but were reinstated. As Andy Baio notes, these deepfakes are legally interesting:

With these takedowns, Roc Nation is making two claims:

1. These videos are an infringing use of Jay-Z’s copyright.
2. The videos “unlawfully uses an AI to impersonate our client’s voice.”

But are either of these true? With a technology this new, we’re in untested legal waters.

The Vocal Synthesis audio clips were created by training a model with a large corpus of audio samples and text transcriptions. In this case, he fed Jay-Z songs and lyrics into Tacotron 2, a neural network architecture developed by Google.

It seems reasonable to assume that a model and audio generated from copyrighted audio recordings would be considered derivative works.

But is it copyright infringement? Like virtually everything in the world of copyright, it depends on how it was used, and for what purpose.

Celebrity impressions by people are allowed, why not ones by machines? It’ll be interesting to see where this goes as the tech gets better.

Deepfake Video of Robert Downey Jr. and Tom Holland in Back to the Future

posted by Jason Kottke   Feb 19, 2020

This deepfake video of Back to the Future that features Robert Downey Jr. & Tom Holland as Doc Brown & Marty McFly is so convincing that I almost want to see an actual remake with those actors. (Almost.)

They really should have deepfaked Zendaya into the video as Lorraine for the cherry on top. Here’s an earlier effort with Holland as Marty that’s not as good.

Billie Eilish Interviewed by AI Bot

posted by Jason Kottke   Feb 14, 2020

Collaborating with the team at Conde Nast Entertainment and Vogue, my pal Nicole He trained an AI program to interview music superstar Billie Eilish. Here are a few of the questions:

Who consumed so much of your power in one go?
How much of the world is out of date?
Have you ever seen the ending?

This is a little bit brilliant. The questions are childlike in a way, like something a bright five-year-old would ask a grownup, perceptive and nonsensical (or even Dr. Seussical) at the same time. As He says:

What I really loved hearing Billie say was that human interviewers often ask the same questions over and over, and she appreciated that the AI questions don’t have an agenda in the same way, they’re not trying to get anything from her.

I wonder if there's something that human interviewers can learn from AI-generated questions — maybe using them as a jumping-off point for their own questions, asking more surprising or abstract questions, or adopting the mentality of the childlike mind.

See also Watching Teen Superstar Billie Eilish Growing Up.

A Machine Dreams Up New Insect Species

posted by Jason Kottke   Jan 03, 2020

Using a book of insect illustrations from the 1890s, Bernat Cuni used a variety of machine learning tools to generate a bunch of realistic-looking beetles that don’t actually exist in nature.

Prints are available.

A Deepfake Nixon Delivers Eulogy for the Apollo 11 Astronauts

posted by Jason Kottke   Nov 26, 2019

When Neil Armstrong and Buzz Aldrin landed safely on the Moon in July 1969, President Richard Nixon called them from the White House during their moonwalk to say how proud he was of what they had accomplished. But in the event that Armstrong and Aldrin did not make it safely off the Moon’s surface, Nixon was prepared to give a very different sort of speech. The remarks were written by William Safire and recorded in a memo called In Event of Moon Disaster.

Fifty years ago, not even Stanley Kubrick could have faked the Moon landing. But today, visual effects and techniques driven by machine learning are so good that it might be relatively simple, at least the television broadcast part of it.1 In a short demonstration of that technical supremacy, a group from MIT has created a deepfake version of Nixon delivering that disaster speech. Here are a couple of clips from the deepfake speech:

Fate has ordained that the men who went to the moon to explore in peace will stay on the moon to rest in peace.

The full film is being shown at IDFA DocLab in Amsterdam and will make its way online sometime next year.

The implications of being able to so convincingly fake the televised appearance of a former US President are left as an exercise to the reader. (via boing boing)

Update: The whole film is now online. (thx, andy)

  1. But technology is often a two-way street. If the resolution of the broadcast is high enough, CGI probably still has tells…and AI definitely does. And even if you got the TV broadcast correct, with the availability of all sorts of high-tech equipment, the backyard astronomer, with the collective help of their web-connected compatriots around the world, would probably be able to easily sniff out whether actual spacecraft and communication signals were in transit to and from the Moon.

Can You Copyright Work Made by an Artificial Intelligence?

posted by Jason Kottke   Nov 26, 2019

In a recent issue of Why is this interesting?, Noah Brier collects a number of perspectives on whether (and by whom) a work created by an artificial intelligence can be copyrighted.

But as I dug in a much bigger question emerged: Can you actually copyright work produced by AI? Traditionally, the law has been that only work created by people can receive copyright. You might remember the monkey selfie copyright claim from a few years back. In that case, a photographer gave his camera to a monkey who then snapped a selfie. The photographer then tried to claim ownership and PETA sued him to try to claim it back for the monkey. In the end, the photograph was judged to be in the public domain, since copyright requires human involvement. Machines, like monkeys, can’t own work, but clearly something made with the help of a human still qualifies for copyright. The question, then, is where do we draw the line?

Astrology and Wishful Thinking

posted by Jason Kottke   Nov 14, 2019

In the Guardian, former astrologer Felicity Carter writes about how fortune telling really works and why she had to quit.

I also learned that intelligence and education do not protect against superstition. Many customers were stockbrokers, advertising executives or politicians, dealing with issues whose outcomes couldn’t be controlled. It’s uncertainty that drives people into woo, not stupidity, so I’m not surprised millennials are into astrology. They grew up with Harry Potter and graduated into a precarious economy, making them the ideal customers.

What broke the spell for me was, oddly, people swearing by my gift. Some repeat customers claimed I’d made very specific predictions, of a kind I never made. It dawned on me that my readings were a co-creation — I would weave a story and, later, the customer’s memory would add new elements. I got to test this theory after a friend raved about a reading she’d had, full of astonishingly accurate predictions. She had a tape of the session, so I asked her to play it.

The clairvoyant had said none of the things my friend claimed. Not a single one. My friend’s imagination had done all the work.

The last paragraph, on VC-funded astrology apps, was particularly interesting. I’m reading Yuval Noah Harari’s 21 Lessons for the 21st Century right now and one of his main points is that AI + biotech will combine to produce an unprecedented revolution in human society.

For we are now at the confluence of two immense revolutions. Biologists are deciphering the mysteries of the human body, and in particular of the brain and human feelings. At the same time computer scientists are giving us unprecedented data-processing power. When the biotech revolution merges with the infotech revolution, it will produce Big Data algorithms that can monitor and understand my feelings much better than I can, and then authority will probably shift from humans to computers. My illusion of free will is likely to disintegrate as I daily encounter institutions, corporations, and government agencies that understand and manipulate what was until now my inaccessible inner realm.

I hadn’t thought that astrology apps could be a major pathway to AI’s control of humanity, but Carter’s assertion makes sense.

Machine Hallucination

posted by Jason Kottke   Sep 23, 2019

Machine Hallucination

After seeing some videos on my pal Jenni’s Instagram of Refik Anadol’s immersive display at ARTECHOUSE in NYC, it’s now at the top of my list of things to see the next time I’m in NYC.

Machine Hallucination, Anadol’s first large-scale installation in New York City is a mixed reality experiment deploying machine learning algorithms on a dataset of over 300 million images — representing a wide-ranging selection of architectural styles and movements — to reveal the hidden connections between these moments in architectural history. As the machine generates a data universe of architectural hallucinations in 1025 dimensions, we can begin to intuitively understand the ways that memory can be spatially experienced and the power of machine intelligence to both simultaneously access and augment our human senses.

Here’s a video of Anadol explaining his process and a little bit about Machine Hallucination. Check out some reviews at Designboom, Gothamist, and Art in America and watch some video of the installation here.

Pixar’s AI Spiders

posted by Jason Kottke   Sep 06, 2019

As I mentioned in a post about my west coast roadtrip, one of the things I heard about during my visit to Pixar was their AI spiders. For Toy Story 4, the production team wanted to add some dusty ambiance to the antique store in the form of cobwebs.

Toy Story Cobwebs

Rather than having to painstakingly create the webs by hand as they’d done in the past, technical director Hosuk Chang created a swarm of AI spiders that could weave the webs just like a real spider would.

We actually saw the AI spiders in action and it was jaw-dropping to see something so simple, yet so technically amazing, create realistic background elements like cobwebs. The spiders appeared as red dots that would weave their way between two wood elements just like a real spider would.

All the animators had to do was tell the spiders where the cobwebs needed to be.

“He guided the spiders to where he wanted them to build cobwebs, and they’d do the job for us. And when you see those cobwebs overlaid on the rest of the scene, it gives the audience the sense that this place has been here for a while.” Without that program, animators would have had to make the webs one strand at a time, which would have taken several months. “You have to tell the spider where the connection points of the cobweb should go,” Jordan says, “but then it does the rest.”
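A heavily simplified sketch of the idea (not Pixar's implementation, just an illustration): given the animator's anchor points, a virtual spider lays main strands between them and then weaves cross-threads between random points along those strands.

```python
# Heavily simplified sketch of procedural cobweb generation (not Pixar's
# implementation): connect animator-chosen anchor points with main
# strands, then weave cross-threads between points along those strands.
import itertools
import random

def lerp(a, b, t):
    return tuple(ai + (bi - ai) * t for ai, bi in zip(a, b))

def weave_cobweb(anchors, cross_threads=12, sag=0.05, seed=0):
    rng = random.Random(seed)
    strands = []

    # Main strands: one between each pair of anchor points.
    main = list(itertools.combinations(anchors, 2))
    strands.extend(main)

    # Cross-threads: connect random points along two different main
    # strands, pulled slightly downward to fake gravity sag.
    for _ in range(cross_threads):
        (a1, b1), (a2, b2) = rng.sample(main, 2)
        p = lerp(a1, b1, rng.uniform(0.2, 0.8))
        q = lerp(a2, b2, rng.uniform(0.2, 0.8))
        mid = lerp(p, q, 0.5)
        mid = (mid[0], mid[1] - sag, mid[2])  # y is "up" in this toy scene
        strands.append((p, mid))
        strands.append((mid, q))
    return strands

# Anchor points an animator might pick on two shelf corners and a wall.
anchors = [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0), (0.5, 0.2, 0.1)]
web = weave_cobweb(anchors)
print(f"{len(web)} strand segments generated")
```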

Chang and his colleague David Luoh presented a paper about the spiders (and dust) at SIGGRAPH ‘19 in late July (which is unfortunately behind a paywall).

VFX Breakdown of Ctrl Shift Face’s Ultra-Realistic Deepfakes

posted by Jason Kottke   Aug 27, 2019

Ctrl Shift Face created the popular deepfake videos of Bill Hader impersonating Arnold Schwarzenegger, Hader doing Tom Cruise, and Jim Carrey in The Shining. For their latest video, they edited Freddie Mercury’s face onto Rami Malek1 acting in a scene from Mr. Robot:

And for the first time, they shared a short visual effects breakdown of how these deepfakes are made:

Mercury/Malek says in the scene: “Even I’m not crazy enough to believe that distortion of reality.” Ctrl Shift Face is making it difficult to believe these deepfakes aren’t real.

  1. I had dinner next to Malek at the bar in a restaurant in the West Village a few months ago, pre-Oscar. I didn’t notice who it was when he sat down but as soon as he opened his mouth, I knew it was him — that unmistakable voice. Several people came by to say hello, buy him drinks, etc. and he and his friends were super gracious to everyone, staff included. I’ve added him to my list of actors who are actually nice alongside Tom Hanks and Keanu Reeves.

Photo Wake-Up

posted by Jason Kottke   Jun 19, 2019

Photo Wake Up

Researchers at the University of Washington and Facebook have developed an algorithm that can “wake up” people depicted in still images (photos, drawings, paintings) and create 3D characters that can “walk out” of their images. Check out some examples and their methods here (full paper):

The AR implementation of their technique is especially impressive…a figure in a Picasso painting just comes alive and starts running around the room. (thx nick, who accurately notes the Young Sherlock Holmes vibe)

Deepfakes: Imagine All the People

posted by Jason Kottke   Jun 13, 2019

Here is a video of Donald Trump, Vladimir Putin, Barack Obama, Kim Jong Un, and other world leaders lip-syncing along to John Lennon’s Imagine:

Of course this isn’t real. The video was done by a company called Canny AI, which offers services like “replace the dialogue in any footage” and “lip-sync your dubbed content in any language”. That’s cool and all — picture episodes of Game of Thrones or Fleabag where the actors automagically lip-sync along to dubbed French or Chinese — but this technique can also be used to easily create what are referred to as deepfakes, videos made using AI techniques in which people convincingly say and do things they actually did not do or say. Like this video of Mark Zuckerberg finally telling the truth about Facebook. Or this seriously weird Steve Buscemi / Jennifer Lawrence mashup:

Or Bill Hader’s face morphing into Arnold Schwarzenegger’s face every time he impersonates him:

What should we do about these kinds of videos? Social media sites have been removing some videos intended to mislead or confuse people, but notably Facebook has refused to take the Zuckerberg video down (as well as a slowed-down video of Nancy Pelosi in which she appears drunk). Congress is moving ahead with a hearing on deepfakes and the introduction of a related bill:

The draft bill, a product of several months of discussion with computer scientists, disinformation experts, and human rights advocates, will include three provisions. The first would require companies and researchers who create tools that can be used to make deepfakes to automatically add watermarks to forged creations.

The second would require social-media companies to build better manipulation detection directly into their platforms. Finally, the third provision would create sanctions, like fines or even jail time, to punish offenders for creating malicious deepfakes that harm individuals or threaten national security. In particular, it would attempt to introduce a new mechanism for legal recourse if people’s reputations are damaged by synthetic media.

I’m hopeful this bill will crack down on the malicious use of deepfakes and other manipulated videos but leave ample room for delightful art and culture hacking like the Hader/Schwarzenegger thing or one of my all-time favorite videos, a slowed-down Jeff Goldblum extolling the virtues of the internet in an Apple ad:

“Internet? I’d say internet!”

Update: Here’s another Bill Hader deepfake, with his impressions of Tom Cruise and Seth Rogen augmented by his face being replaced by theirs.

Pattern Radio: Whale Songs

posted by Jason Kottke   Jun 10, 2019

The National Oceanic and Atmospheric Administration (NOAA) and Google have teamed up on a project to identify the songs of humpback whales from thousands of hours of audio using AI. The AI proved to be quite good at detecting whale sounds and the team has put the files online for people to listen to at Pattern Radio: Whale Songs. Here’s a video about the project:

You can literally browse through more than a year’s worth of underwater recordings as fast as you can swipe and scroll. You can zoom all the way in to see individual sounds — not only humpback calls, but ships, fish and even unknown noises. And you can zoom all the way out to see months of sound at a time. An AI heat map guides you to where the whale calls most likely are, while highlight bars help you see repetitions and patterns of the sounds within the songs.

The audio interface is cool — you can zoom in and out of the audio wave patterns to see the different rhythms of communication. I’ve had the audio playing in the background for the past hour while I’ve been working…very relaxing.
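As a toy stand-in for the detection step (the real system uses a neural network trained on spectrograms of NOAA's recordings; the audio and threshold below are invented): compute a spectrogram and flag time windows with unusual energy in the band where humpback calls tend to fall.

```python
# Toy stand-in for the whale-call detection step (synthetic audio and an
# invented threshold; the real system uses a neural network over
# spectrograms of NOAA's recordings).
import numpy as np
from scipy.signal import spectrogram

sr = 8000  # sample rate in Hz
t = np.arange(0, 10, 1 / sr)

# Synthetic recording: low-level noise plus a 400 Hz "call" from 4-6 s.
audio = 0.01 * np.random.randn(t.size)
call = (t > 4) & (t < 6)
audio[call] += 0.2 * np.sin(2 * np.pi * 400 * t[call])

freqs, times, power = spectrogram(audio, fs=sr, nperseg=1024)

# Sum energy in a band where humpback song components often fall
# (roughly 100 Hz - 2 kHz), then flag windows well above the median.
band = (freqs >= 100) & (freqs <= 2000)
band_energy = power[band].sum(axis=0)
threshold = 10 * np.median(band_energy)
hits = times[band_energy > threshold]

if hits.size:
    print(f"possible call between t = {hits.min():.1f} s and {hits.max():.1f} s")
else:
    print("no calls flagged")
```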

Teaching a Neural Network How to Drive a Car

posted by Jason Kottke   Jun 06, 2019

In this video, you can watch a simple neural network learn how to navigate a video game race track. The program doesn’t know how to turn at first, but the car that got the furthest in the first race (out of 650 competitors) is then used as the seed for the next generation. The winning cars from each generation are used to seed the next race until a few of them make it all the way around the track in just the 4th generation.
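The generational loop is simple enough to sketch in a few lines. Here's a stripped-down version with a toy 1-D steering task and random mutation (not the video's actual game or network):

```python
# Stripped-down sketch of the generational training loop described above
# (a toy 1-D steering task with random mutation, not the video's actual
# game or network architecture).
import numpy as np

rng = np.random.default_rng(0)

def track_center(x):
    return np.sin(x / 3.0)  # the road curves like a sine wave

def drive(weights, steps=200):
    """Distance travelled before drifting too far from the track center."""
    x, y = 0.0, 0.0
    for _ in range(steps):
        error = track_center(x) - y               # how far off-center we are
        features = np.array([error, 1.0])
        y += float(np.tanh(features @ weights))   # steering decision
        x += 0.1
        if abs(track_center(x) - y) > 1.0:        # off the track: crash
            break
    return x

population = [rng.normal(size=2) for _ in range(650)]
for generation in range(4):
    fitness = [drive(w) for w in population]
    best = population[int(np.argmax(fitness))]
    print(f"generation {generation + 1}: best distance = {max(fitness):.1f}")
    # Seed the next generation with mutated copies of the winner.
    population = [best + rng.normal(scale=0.2, size=2) for _ in range(650)]
```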

I think one of the reasons I find neural network training so fascinating is that you can observe, in a very simple and understandable way, the basic method by which all life on Earth evolved the ability to do things like move, see, swim, digest food, echolocate, grasp objects, and use tools. (via dunstan)

This AI Converts Quick Sketches to Photorealistic Landscapes

posted by Jason Kottke   Mar 27, 2019

NVIDIA has been doing lots of interesting things with deep learning algorithms lately (like AI-Generated Human Faces That Look Amazingly Real). Their most recent effort is the development and training of a program that takes rough sketches and converts them into realistic images.

A novice painter might set brush to canvas aiming to create a stunning sunset landscape — craggy, snow-covered peaks reflected in a glassy lake — only to end up with something that looks more like a multi-colored inkblot.

But a deep learning model developed by NVIDIA Research can do just the opposite: it turns rough doodles into photorealistic masterpieces with breathtaking ease. The tool leverages generative adversarial networks, or GANs, to convert segmentation maps into lifelike images.

Here’s a post I did 10 years ago that shows how far sketch-to-photo technology has come.

AI Algorithm Can Detect Alzheimer’s Earlier Than Doctors

posted by Jason Kottke   Jan 07, 2019

A machine learning algorithm programmed by Dr. Jae Ho Sohn can look at PET scans of human brains and spot indicators of Alzheimer’s disease with a high level of accuracy an average of 6 years before the patients would receive a final clinical diagnosis from a doctor.

To train the algorithm, Sohn fed it images from the Alzheimer’s Disease Neuroimaging Initiative (ADNI), a massive public dataset of PET scans from patients who were eventually diagnosed with either Alzheimer’s disease, mild cognitive impairment or no disorder. Eventually, the algorithm began to learn on its own which features are important for predicting the diagnosis of Alzheimer’s disease and which are not.

Once the algorithm was trained on 1,921 scans, the scientists tested it on two novel datasets to evaluate its performance. The first were 188 images that came from the same ADNI database but had not been presented to the algorithm yet. The second was an entirely novel set of scans from 40 patients who had presented to the UCSF Memory and Aging Center with possible cognitive impairment.

The algorithm performed with flying colors. It correctly identified 92 percent of patients who developed Alzheimer’s disease in the first test set and 98 percent in the second test set. What’s more, it made these correct predictions on average 75.8 months — a little more than six years — before the patient received their final diagnosis.
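The evaluation protocol, in outline, looks like the sketch below, with synthetic feature vectors standing in for ADNI PET scans and a generic classifier standing in for the paper's convolutional network:

```python
# Outline of the train-then-test-on-two-held-out-sets protocol described
# above, with synthetic features standing in for ADNI PET scans and a
# generic classifier standing in for the paper's convolutional network.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import recall_score

rng = np.random.default_rng(42)

def synthetic_scans(n, separation=0.4):
    """n fake 'scan feature vectors'; half later diagnosed with Alzheimer's."""
    y = rng.integers(0, 2, n)
    X = rng.normal(size=(n, 50)) + separation * y[:, None]
    return X, y

X_train, y_train = synthetic_scans(1921)   # training scans
X_test1, y_test1 = synthetic_scans(188)    # held-out ADNI-like set
X_test2, y_test2 = synthetic_scans(40)     # independent clinic-like set

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)

for name, X, y in [("held-out set", X_test1, y_test1),
                   ("independent set", X_test2, y_test2)]:
    sensitivity = recall_score(y, clf.predict(X))
    print(f"{name}: sensitivity = {sensitivity:.0%}")
```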

This is the stuff where AI is going to be totally useful…provided the programs aren’t cheating somehow.

AI-Generated Human Faces That Look Amazingly Real

posted by Jason Kottke   Dec 27, 2018

The opening line of Madeline Miller’s Circe is: “When I was born, the name for what I was did not exist.” In Miller’s telling of the mythological story, Circe was the daughter of a Titan and a sea nymph (a lesser deity born of two Titans). Yes, she was an immortal deity but lacked the powers and bearing of a god or a nymph, making her seem unnervingly human. Not knowing what to make of her and for their own safety, the Titans and Olympic gods agreed to banish her forever to an island.

Here’s a photograph of a woman who could also claim “when I was born, the name for what I was did not exist”:

AI Faces

The previous line contains two lies: this is not a photograph and that’s not a real person. It’s an image generated by an AI program developed by researchers at NVIDIA capable of borrowing styles from two actual photographs of real people to produce an infinite number of fake but human-like & photograph-like images.

AI Faces

We propose an alternative generator architecture for generative adversarial networks, borrowing from style transfer literature. The new architecture leads to an automatically learned, unsupervised separation of high-level attributes (e.g., pose and identity when trained on human faces) and stochastic variation in the generated images (e.g., freckles, hair), and it enables intuitive, scale-specific control of the synthesis.

The video offers a good look at how this works, with realistic facial features that you can change with a slider, like adjusting the volume on your stereo.
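That slider amounts to interpolating in the generator's latent space and re-rendering at each step. Here's a hedged sketch with a dummy generator (the real StyleGAN injects styles at multiple scales and is far more involved):

```python
# Sketch of the "slider" idea: interpolate between two latent vectors and
# feed each blend to the generator. The generator here is a dummy module;
# the real StyleGAN injects styles at multiple scales.
import torch
import torch.nn as nn

latent_dim = 512

class DummyGenerator(nn.Module):
    """Placeholder for a trained StyleGAN-style generator."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(latent_dim, 1024), nn.ReLU(),
                                 nn.Linear(1024, 3 * 64 * 64), nn.Tanh())

    def forward(self, z):
        return self.net(z).view(-1, 3, 64, 64)

G = DummyGenerator()
z_a = torch.randn(1, latent_dim)   # one "identity"
z_b = torch.randn(1, latent_dim)   # another "identity"

# Moving the slider from 0 to 1 morphs one generated face into the other.
for slider in [0.0, 0.25, 0.5, 0.75, 1.0]:
    z = (1 - slider) * z_a + slider * z_b
    img = G(z)
    print(f"slider={slider:.2f} -> generated image tensor {tuple(img.shape)}")
```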

Photographs that aren’t photographs and people that aren’t people, born of a self-learning machine developed by humans. We’ll want to trust these images because they look so real, especially once they start moving and talking. I wonder…will we soon seek to banish them for our own safety as the gods banished Circe?

Update: This Person Does Not Exist is a single serving site that provides a new portrait of a non-existent person with each reload.

Remastered Film Footage of 1890s Paris

posted by Jason Kottke   Dec 17, 2018

The Lumière brothers were among the first filmmakers in history, and from 1896 to 1900 they shot several scenes around Paris. Guy Jones remastered the Lumières' Paris footage, stabilized it, slowed it down to a natural rate, and added some Foley sound effects. Because Paris today looks very similar to how it did then, it's easy to pick out many of the locations seen in this short compilation: the Tuileries, Notre-Dame, the Place de la Concorde, and of course the Eiffel Tower, which was completed only 8 years before filming. Here's the full location listing:

0:08 - Notre-Dame Cathedral (1896)
0:58 - Alma Bridge (1900)
1:37 - Avenue des Champs-Élysées (1899)
2:33 - Place de la Concorde (1897)
3:24 - Passing of a fire brigade (1897)
3:58 - Tuileries Garden (1896)
4:48 - Moving walkway at the Paris Exposition (1900)
5:24 - The Eiffel Tower from the Rives de la Seine à Paris (1897)

See also A Bunch of Early Color Photos of Paris, Peter Jackson’s documentary film featuring remastered film footage from World War I, and lots more film that Jones has remastered and uploaded. (via open culture)

Update: Just as he did with the NYC footage from 1911, Denis Shiryaev has used machine learning algorithms to restore the Lumières' film of Paris — it's been upsampled to 4K & 60 fps, sharpened, and colorized.

Again, there are some obvious artifacts and the colorization is distracting, but the result is impressive for a push-button process. (via open culture)

How AI Agents Cheat

posted by Jason Kottke   Nov 12, 2018

This spreadsheet lists a number of ways in which AI agents “cheat” in order to accomplish tasks or get higher scores instead of doing what their human programmers actually want them to. A few examples from the list:

Neural nets evolved to classify edible and poisonous mushrooms took advantage of the data being presented in alternating order, and didn’t actually learn any features of the input images.

In an artificial life simulation where survival required energy but giving birth had no energy cost, one species evolved a sedentary lifestyle that consisted mostly of mating in order to produce new children which could be eaten (or used as mates to produce more edible children).

Agent kills itself at the end of level 1 to avoid losing in level 2.

AI trained to classify skin lesions as potentially cancerous learns that lesions photographed next to a ruler are more likely to be malignant.

That second item is a doozy! Philosopher Nick Bostrom has warned of the dangers of superintelligent agents that exploit human error in programming them, describing a possible future where an innocent paperclip-making machine destroys the universe.

The “paperclip maximiser” is a thought experiment proposed by Nick Bostrom, a philosopher at Oxford University. Imagine an artificial intelligence, he says, which decides to amass as many paperclips as possible. It devotes all its energy to acquiring paperclips, and to improving itself so that it can get paperclips in new ways, while resisting any attempt to divert it from this goal. Eventually it “starts transforming first all of Earth and then increasing portions of space into paperclip manufacturing facilities”.

But some of this is The Lebowski Theorem of machine superintelligence in action. These agents didn't necessarily hack their reward functions, but they did take a far easier path to their goals, e.g. the Tetris-playing bot that “paused the game indefinitely to avoid losing”.
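A toy version of the Tetris exploit (invented numbers): if the reward signal only punishes losing, an agent comparing expected returns will happily choose to pause forever.

```python
# Toy illustration of the Tetris "pause forever" exploit (invented
# numbers): when the reward only punishes losing, never finishing the
# game is the highest-value policy.
LOSE_PENALTY = -100      # reward when the game ends in a loss
STEP_REWARD = 0          # reward for surviving one more step
PAUSE_REWARD = 0         # pausing costs nothing in this reward design

def expected_return(policy, horizon=1000, lose_prob_per_step=0.01):
    total, alive_prob = 0.0, 1.0
    for _ in range(horizon):
        if policy == "pause":
            total += PAUSE_REWARD          # nothing ever happens
        else:  # "play"
            total += alive_prob * STEP_REWARD
            total += alive_prob * lose_prob_per_step * LOSE_PENALTY
            alive_prob *= 1 - lose_prob_per_step
    return total

for policy in ("play", "pause"):
    print(f"{policy}: expected return = {expected_return(policy):.1f}")
# The agent that maximizes this reward "learns" to pause indefinitely.
```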

Update: A program that trained on a set of aerial photographs was asked to generate a map and then an aerial reconstruction of a previously unseen photograph. The reconstruction matched the photograph a little too closely…and it turned out that the program was hiding information about the photo in the map (kind of like in Magic Eye puzzles).

We claim that CycleGAN is learning an encoding scheme in which it “hides” information about the aerial photograph x within the generated map Fx. This strategy is not as surprising as it seems at first glance, since it is impossible for a CycleGAN model to learn a perfect one-to-one correspondence between aerial photographs and maps, when a single map can correspond to a vast number of aerial photos, differing for example in rooftop color or tree location.
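The pressure to hide information comes from CycleGAN's cycle-consistency term: the photo has to survive a round trip photo → map → photo. A rough PyTorch sketch of that constraint (toy networks and random tensors, not the actual CycleGAN code):

```python
# Rough sketch of the cycle-consistency term discussed above (toy
# networks and random tensors, not the actual CycleGAN code). Because the
# round trip photo -> map -> photo must reproduce the photo, the map
# generator is pressured to encode photo details the map alone can't hold.
import torch
import torch.nn as nn

photo_to_map = nn.Conv2d(3, 3, kernel_size=3, padding=1)  # the "F" in the quote
map_to_photo = nn.Conv2d(3, 3, kernel_size=3, padding=1)  # the inverse generator
l1 = nn.L1Loss()

photo = torch.rand(1, 3, 64, 64)  # stand-in for an aerial photograph
fake_map = photo_to_map(photo)
reconstructed_photo = map_to_photo(fake_map)

# The cycle loss: the reconstruction must match the original photo, even
# though a "true" map would throw away rooftop colors, tree positions, etc.
cycle_loss = l1(reconstructed_photo, photo)
cycle_loss.backward()
print(f"cycle-consistency loss: {cycle_loss.item():.3f}")
```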

There’s Waldo, an AI trained to find Waldo

posted by Jason Kottke   Aug 13, 2018

Add finding Waldo to the long list of things that machines can do better than humans. Creative agency Redpepper built a program that uses Google’s drag-and-drop machine learning service to find the eponymous character in the Where’s Waldo? series of books. After the AI finds a promising Waldo candidate, a robotic arm points to it on the page.

While only a prototype, the fastest There’s Waldo has pointed out a match has been 4.45 seconds which is better than most 5 year olds.
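Redpepper used Google's drag-and-drop AutoML service, but the core find-the-face step can be approximated locally with plain template matching in OpenCV, a much simpler and less robust stand-in (the image file names below are hypothetical):

```python
# Simpler stand-in for the find-Waldo step (template matching with
# OpenCV rather than Redpepper's trained AutoML model; the image file
# names are hypothetical).
import cv2

scene = cv2.imread("wheres_waldo_page.jpg")     # a scanned puzzle page
template = cv2.imread("waldo_face.jpg")         # a cropped Waldo face
assert scene is not None and template is not None, "supply your own images"

# Slide the template over the scene and score the match at each position.
scores = cv2.matchTemplate(scene, template, cv2.TM_CCOEFF_NORMED)
_, best_score, _, best_loc = cv2.minMaxLoc(scores)

h, w = template.shape[:2]
print(f"best match score: {best_score:.2f} at {best_loc}")
if best_score > 0.6:  # threshold chosen by eye; a real system would tune it
    top_left = best_loc
    bottom_right = (top_left[0] + w, top_left[1] + h)
    cv2.rectangle(scene, top_left, bottom_right, (0, 0, 255), 3)
    cv2.imwrite("waldo_found.jpg", scene)       # the robot arm would point here
```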

I know Skynet references are a little passé these days, but the plot of Terminator 2 is basically an intelligent machine playing Where’s Waldo I Want to Kill Him. We’re getting there!

Update: Prior art: Hey Waldo, HereIsWally, There’s Waldo!

This AI Makes High-Quality Slow Motion Videos from Regular 30 fps Video

posted by Jason Kottke   Jul 23, 2018

NVIDIA trained a deep learning framework to take videos filmed at 30 fps and turn them into slow motion videos at the equivalent of 240 or even 480 fps. Even though the system is guessing on the content in the extra frames, the final results look amazingly sharp and lifelike.

“There are many memorable moments in your life that you might want to record with a camera in slow-motion because they are hard to see clearly with your eyes: the first time a baby walks, a difficult skateboard trick, a dog catching a ball,” the researchers wrote in the research paper. “While it is possible to take 240-frame-per-second videos with a cell phone, recording everything at high frame rates is impractical, as it requires large memories and is power-intensive for mobile devices,” the team explained.

With this new research, users can slow down their recordings after taking them.
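The naive way to guess the extra frames is to blend adjacent ones, which is worth sketching just to see why it isn't enough: blending makes moving objects ghost and cross-fade, whereas NVIDIA's network estimates the motion between frames and stays sharp. A minimal blending sketch with synthetic frames (not the paper's method):

```python
# Naive frame interpolation for contrast with NVIDIA's approach: blend
# neighboring frames to synthesize in-between ones (synthetic frames
# here; the paper's network estimates motion instead, so it stays sharp).
import numpy as np

def interpolate(frames, factor=8):
    """Turn a 30 fps sequence into factor*30 fps by linear blending."""
    out = []
    for a, b in zip(frames[:-1], frames[1:]):
        for i in range(factor):
            t = i / factor
            out.append(((1 - t) * a + t * b).astype(a.dtype))
    out.append(frames[-1])
    return out

# Two stand-in 30 fps frames: a bright square that moves to the right.
f0 = np.zeros((120, 160), dtype=np.float32)
f1 = np.zeros((120, 160), dtype=np.float32)
f0[50:70, 40:60] = 1.0
f1[50:70, 90:110] = 1.0

slowmo = interpolate([f0, f1], factor=8)   # 8x: roughly 30 -> 240 "fps"
print(f"{len(slowmo)} frames generated from 2 input frames")
# Blending makes the square cross-fade instead of moving, which is the
# artifact a motion-aware model like NVIDIA's avoids.
```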

Using this technique and what Peter Jackson’s team is doing with WWI footage, it would be interesting to clean up and slow down all sorts of archival footage (like the Zapruder film, just to choose one obvious example).