
Sunday, September 4, 2022

Image-to-Image Style Transfer

Suppose you made this rough sketch and wanted to finish it in a photo-real or painterly style.


Twitter user @TomLikesRobots is a digital artist who did just that, using Stable Diffusion (#stablediffusion), a new open-source image-generation model with a powerful feature called Image to Image (#img2img).


He uploaded the sketch and told the system to render it as "A black and white photo of a young woman, studio lighting, realistic, Ilford HP5 400." The hair style is a little different, and the ear is weird, but the basic pose and lighting are pretty close to the sketch.

Tom says: "Overall, composition is controlled by the sketch and the details are controlled by the prompt."
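If you want to try the same workflow yourself, here's a minimal sketch of an img2img run using the open-source Hugging Face diffusers library. The checkpoint name and parameter values are illustrative, and the image argument has been renamed across diffusers versions, so treat this as a starting point rather than a recipe:

```python
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

# Load an open-source Stable Diffusion checkpoint (illustrative choice).
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

sketch = Image.open("rough_sketch.png").convert("RGB").resize((512, 512))

result = pipe(
    prompt="A black and white photo of a young woman, studio lighting, "
           "realistic, Ilford HP5 400",
    image=sketch,           # older diffusers versions call this init_image
    strength=0.75,          # lower = stay closer to the sketch's composition
    guidance_scale=7.5,     # higher = follow the prompt more literally
).images[0]

result.save("rendered.png")
```

The strength parameter is where Tom's observation lives: it sets how much of the composition stays controlled by the sketch and how much of the detail is handed over to the prompt.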



This time he told the system to make the sketch look like "A portrait of a young woman by Norman Rockwell."

It looks like a painting, but not much like Rockwell—more like a Victorian artist like Charles S. Lidderdale. On close scrutiny it doesn't hold up too well: the blue ribbon and the costume don't make sense, and her left eye has too many eyelashes.


Here's the sketch rendered in the style of Gustav Klimt. It's got a lot of drawing problems that Klimt never would have allowed, but it is somewhat reminiscent of his style.


And here's the sketch as painted by Vincent Van Gogh. Again, we can pick it apart, but it's in the ballpark.

And finally, Alphonse Mucha. He is one of the hardest to emulate because his style combines soft internal transitions with clear outlines.

The system was also able to translate the sketch into the features of famous celebrities:

Selena Gomez

Scarlett Johansson

Nicole Kidman

Emma Stone

All of them have big problems with the hair, but they're recognizable, and let's admit the tech is in its infancy and will only get better.

If you're an artist, you might find all this a little scary, threatening, or astounding. I do too! I feel like we've been introduced to a magical genie who can bring whatever we wish into existence.

Some have responded by calling for laws to ban the technology from creating images in the styles of living, working artists. What if someone created a print for sale that was supposedly painted by you or me?

I think we have to be careful how we respond to this apparent threat. Let's keep in mind that the technology is transformative, meaning it doesn't copy and paste images; it creates something new. And artistic styles can't be—and shouldn't be—copyrightable. These tools are here to stay. They're only going to get more advanced, they're open-source, and many of the people developing them are artists themselves.

I'm wary of laws or AI bots that restrict the growth of this new art form, or that drive the prompts underground. Instead of involving politicians and lawyers and AI bots, we should encourage a culture of mutual respect and fair play. Perhaps we should encourage generative artists to share their prompts when they use a living artist's name, and never to mislead their audience into thinking the work was actually painted by that artist. Basically, people should give credit where credit is due.

How do you think we should regulate or guide this new art industry?

Thursday, June 2, 2022

A Good Explainer on AI Art

I'm honored that Vox Media asked me to be part of this video about AI-generated artwork.


If you don't want to watch the whole thing, I make a brief appearance at 10:25.

Producer Joss Fong and her team came up with a brilliant explanation of what actually happens inside a deep-learning model.

At 5:59 she explains multi-dimensionality with the example of yellow banana vs. red balloon. It's intriguing that we can't possibly know, in human terms, the criteria that the system is using to arrive at its results, or exactly what features it's extracting when a certain artist's name is used in the prompt.
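To make the idea of a multi-dimensional feature space concrete, here's a small sketch using OpenAI's open-source clip package (my own illustration, not the model from the video). It embeds three phrases as vectors and prints their pairwise similarities; phrases that share a color or an object should land closer together in that space:

```python
import torch
import clip  # pip install git+https://github.com/openai/CLIP.git

device = "cpu"
model, _ = clip.load("ViT-B/32", device=device)

phrases = ["a yellow banana", "a red balloon", "a yellow balloon"]
tokens = clip.tokenize(phrases).to(device)

with torch.no_grad():
    emb = model.encode_text(tokens)
emb = emb / emb.norm(dim=-1, keepdim=True)  # unit-length vectors

# Pairwise cosine similarities: each entry compares two phrases.
print(emb @ emb.T)
```

Each phrase becomes a point in a 512-dimensional space, and the geometry of that space (which directions encode "yellow" or "balloon") is exactly the part we can't read out in human terms.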



There's a hidden bonus video that explores the reactions of various artists. 

Please add your comments:

• How do you feel this technology will affect the business and practice of art that you do?
• Do you want to use these tools?
• Will they change what you do or how you do it?  

I don't feel directly threatened by the tech, but I realize it will offer art buyers a cheap and fast method for generating editorial illustration, album cover art, and concept art. So it puts professional artists in those fields on notice and gives anyone the keys to becoming both an artist and an art director.

As an art watcher, I have a kind of morbid curiosity to see where the technology is headed next, and I dread the onslaught of cheap surrealism that is already flooding social media. Another thing I've noticed is that there's a shelf life to each new set of tools, just as there is for each new type of VFX technique. Each new set of tools becomes old hat as viewers grow accustomed to its look.

Monday, April 25, 2022

Robots with Flowers

How would you imagine a painting of a robot with flowers growing out of it?

"A happy robot with flowers growing out of his head, clouds in the background, digital art." 

It's a whimsical idea that might make a fun concept for a children's book.  

"A detailed painting of a rainbow colored robot with flowers growing out of its head."

Or it might be a promising pitch for an animated film.

"A Rene Magritte painting of a robot head with flowers growing out of the top with clouds in the background."

It could also be a theme for a group exhibition of surrealistic gallery art.

"A painting by Syd Mead of a bipedal robot with flowers growing out of the top of its head."

Designer Ben Barry used variations on this idea to generate over a thousand images in different styles. Mr. Barry is not an imagemaker in the usual sense. He is one of the lucky few who received beta access to the AI image-generation tool Dall•E 2.

"A woodblock print of a bipedal robot with flowers growing out of the top of its head."

Mr. Barry came up with the instigating phrases or prompts, and Dall•E 2 did the rest, creating hundreds or even thousands of novel images in a few hours. The prompts sometimes used the names of dead artists to catalyze the results, but more often than not the prompts were just descriptive. 

These are all high resolution images, adequate for magazine reproduction. 

"A painting by Caravaggio of a robot head with flowers growing out of the top."

Mr. Barry edited a digital book called 1000 Robots, which you can check out for free on Archive.org. He chose the subject matter of flowers and robots because "I find the idea of an artificial intelligence painting robots to be simultaneously humorous and endearing."


"A painting by Norman Rockwell of a robot head with flowers growing out of the top with clouds and a rainbow in a background, digital art"

The technology seems adept at understanding the artistic logic of the prompt, both in terms of style and content. But there are a few incongruous elements, such as the weird red cable that arcs over to the rainbow.

Mr. Barry says: "While the model is capable of generating other types of images, I found paintings to be the area where it truly excelled aesthetically."


"A colorful painting by M.C. Escher of robot head with flowers growing out of the top"

The survey of styles resembles a Society of Illustrators exhibition or a professional illustrators' workbook. The foregoing two pages don't strike me as particularly reminiscent of Rockwell or Escher, but to me they score quite high on internal coherence and aesthetic appeal.  

Right now only a few people have access to this tool, but presumably it will soon be widely available essentially for free.

"a dramatically lit brightly colored detailed painting of a robot artist painting a picture"

The power of this artificial intelligence gives me a mixture of feelings: I'm surprised, delighted, intimidated, and a bit breathless at the speed of the progress. 

If you are an illustrator or gallery artist who paints surrealistic images in your particular style, it's a good time for soul-searching. 

You might consider:
1. How you will use these tools.
2. How you will provide value for clients who have these tools.
3. How you will create artwork that these tools can't accomplish.

This system of artificial intelligence won't eliminate traditional human artists (and by "traditional" I include digital artists along with those who use physical paint).

But it will send shock waves through the illustration world, and it will replace a lot of jobs. Soon, anyone and everyone will be able to create images easily, cheaply, and quickly with simple prompts of natural language.

 --

Learn more about Ben Barry's book called 1000 Robots at Archive.org


Saturday, April 9, 2022

How Smart is Dall-E 2?


Prompt: “Polymer clay dragons eating pizza in a boat”
Computer-generated image (Dall-e 2 by OpenAI) 

For several years now, computers have been able to generate images based on a natural-language prompt.

The resulting images have suffered from problems of logic and global coherence.

For example, here's what you get if you give the computer the prompt “A rabbit detective sitting on a park bench and reading a newspaper in a Victorian setting.” (Latent Diffusion LAION-400M via @loretoparisi)

Where are his legs? His hands? Are those books or newspapers? Is that a coffee table in front of his bench? 

The image doesn't make sense, and we might conclude that the problem comes from the computer not having any experience of living in a body or dealing with the real world. No matter how big the data sets, or how many layers of processing you bring to the task, you can't get past that limitation. 

Or can you? 

OpenAI is one of the pioneers of generating realistic images and art from descriptions in natural language. They recently unveiled new software called Dall-E 2, which has pushed the boundaries of what's possible with this technology.

Here's what Dall-E 2 does with the same prompt: “A rabbit detective sitting on a park bench and reading a newspaper in a Victorian setting.” 


The overall logic is much better. Now he has legs and is really sitting on that bench, even casting a shadow. But the image is still not perfect. What's the black loop in his left hand? And why doesn't he seem to be holding the newspaper with his right hand? 

Here's one more example of how the technology is improving, using the prompt “teddy bears working on new AI research on the moon in the 1980s” 


The first version using older tech (laion400m) looks like a paste-up of unrelated elements.


Here's what Dall-e 2 came up with: a pretty believable image with consistent lighting. 

OpenAI released this YouTube video to introduce the software.

This technology scares some working artists and illustrators. @VividVoid says: "DALL-E is breaking my heart. AI art is about to lay utter waste to traditional visual art forms. This will be so much more destructive than what the Internet did to music. It will be a technological conquest of one of the great human avenues of spiritual transformation."

AI skeptic Gary Marcus doubts whether the technology will ever replace artists because it is just crunching big data sets. It's not learning from embodied experience, nor does it understand symbolic or semantic concepts the way a human does. Marcus says: "This whole thread is weaponized cherry-picked PR; the antithesis of science."


Soon after Dall-E 2 was released, OpenAI gave me beta access to try it out. In this YouTube video, I share my first experiments with it. (Link to YouTube)

Read more
Dall-e 2 at OpenAI
Podcast: Gary Marcus: Toward a Hybrid of Deep Learning and Symbolic AI 

Wednesday, December 8, 2021

Painting an Abandoned House -- in CGI

If you paint in traditional media you may not pay much attention to tutorials about 3D computer graphics.


I hope you'll make an exception for this demo by Andrew Price showing how to create an abandoned house with the computer graphics software Blender.

Price does a great job of explaining not only the steps he takes but also the thinking behind them. As a traditional painter, I'm fascinated by all the tools and tweaks.

Here's a 60 second version if you're pressed for time.

Monday, November 8, 2021

Google Cloud Vision

Google Cloud Vision is a free service that lets you harness the power of machine learning to analyze images. 

You can upload any picture. The algorithm compares the image to a vast database of labeled pictures and then makes its best guess about what objects it sees.

In this Tom Lovell illustration, GCV is very certain that it sees a single cat, and it's relatively certain that it sees a person. No mention of the other cat, the knitting, the blue chair and the white sweater. 

What happens if you give it a fantasy image that doesn't exist in the real world, such as a renegade warrior astride a Styracosaurus with a T.rex-tooth-helmet holding a saber-tooth cat skull on a staff? In this Dinotopia image it recognizes two generalized objects: "a person and an animal."

Clicking on the "labels" tab, you can see that it identifies general qualities of the image with decreasing certainty. It's wrong about hunting and it's wrong about a working animal, but it knows that it's an illustration of an extinct animal.

What happens if you input an image that has no analog in the real world because the image was itself generated by a machine-learning algorithm? Can it find something in the DNA of the image that could help it identify the word prompt that generated the image?


This picture was created by VQGAN+CLIP with the prompt "Constructionist Typography." The properties that it finds are more general than that, but it's in the ballpark.
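If you'd rather call the service from a script than use the drag-and-drop demo, here's a minimal sketch using Google's official Python client (the google-cloud-vision package). It assumes you've already set up a Google Cloud project and credentials; the filename is just a placeholder:

```python
from google.cloud import vision

# Requires GOOGLE_APPLICATION_CREDENTIALS to point at your service-account key.
client = vision.ImageAnnotatorClient()

with open("tom_lovell_illustration.jpg", "rb") as f:
    image = vision.Image(content=f.read())

# Ask for the same "labels" shown in the web demo's Labels tab.
response = client.label_detection(image=image)
for label in response.label_annotations:
    print(f"{label.description}: {label.score:.2f}")
```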
--
Try Google Cloud Vision yourself and let me know in the comments what you discover.


Thursday, October 28, 2021

Mapping the Fruit-Fly Brain

 Scientists have succeeded in mapping the neurons and connections of a fruit fly brain.

"A population of neurons that is responsible for updating the fly’s internal compass."

According to the New York Times, "their speck-size brains are tremendously complex, containing some 100,000 neurons and tens of millions of connections, or synapses, between them....The work, which is continuing, is time-consuming and expensive, even with the help of state-of-the-art machine-learning algorithms. But the data they have released so far is stunning in its detail, composing an atlas of tens of thousands of gnarled neurons in many crucial areas of the fly brain."

Sunday, July 25, 2021

Using "By James Gurney" as a Style Prompt

A couple of weeks ago I shared the results of some text-to-image experiments.

Code wizards have been using machine-learning tools such as VQGAN + CLIP and BigSleep to create novel images that grow spontaneously from word prompts. 

Erfurt Latrine Disaster (Twitter @ErfurtLatrine) Prompt: "Towers" #VQGAN+#CLIP

The prompts can be simple, such as "Towers."


jbusted @jbusted1 "Forbidden Lands 5"
....Or the prompts can evoke a particular role-playing game, such as "Forbidden Lands."

The results develop an unusual style if you add a descriptor naming a studio, portfolio website, or rendering software, such as "from Studio Ghibli," "trending on ArtStation," or "rendered in Unreal Engine."

  "The Grand Hall of the Sacred Library by James Gurney"

To my fascination and delight, some of them have gotten interesting results by including the phrase "by James Gurney." 


dzryk @dzryk
 "The tech bubble bursting by James Gurney"

Twitter user Ryan Moulton @moultano created a set of related images starting with the phrase 'The Hermit Alchemist's Hut' and varying only the style cue:

'The Hermit Alchemist’s Hut by James Gurney'

'The Hermit Alchemist’s Hut rendered in Unreal Engine'.


'The Hermit Alchemist’s Hut by Van Gogh'


"A castle built on the skeleton of a dead god by James Gurney"


Ryan Moulton @moultano "In the Woods, Gouache Painting." 

Using the phrase "In the Woods + Gouache Painting" (without an artist's name) yields something that appears painted in water media, like a Mary Blair concept painting, but with something weird about the kids' faces. 


Ryan Moulton @moultano "In the Woods by James Gurney"

All of the results have issues of basic logic and perspective. They never make sense or seem fully coherent, at least not yet. 

But some of them do suggest a recognizable style. Does this look like my style to you? I'm not sure; it feels both familiar and alien. It almost looks like something from a long lost sketchbook. 
---

Thursday, July 8, 2021

New Tools for Text-to-Image Generation

Generating an image from a line of text entirely by means of computer algorithms has been possible for the last few years. Newly invented tools are yielding results that keep getting more interesting.

The images can be hauntingly surrealistic, such as this one, which was generated by the phrase “when the wind blows.” 

Image courtesy The Big Sleep (source: @advadnoun on Twitter)

It's a little blurry and out of focus, with tendrils of downy fluff waving in dim light. It seems more like a photograph than a painting, but really it's a new category of image, made by computer software drawing from big data sets. 

Lately people's imaginations have been captured by tools such as VQ-GAN and CLIP.
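The basic recipe behind these tools pairs a generator with a judge: CLIP scores how well an image matches the text, and the image is nudged, step by step, to raise that score. Here's a heavily simplified, runnable sketch using OpenAI's open-source clip package. To stay self-contained it optimizes raw pixels directly, whereas the real tools optimize a VQGAN latent code and decode it (the latent prior is what gives their images coherent texture); it also skips CLIP's usual input normalization:

```python
import torch
import torch.nn.functional as F
import clip  # pip install git+https://github.com/openai/CLIP.git

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/32", device=device)
model = model.float()  # keep one dtype so gradients flow cleanly

# Encode the prompt once; this is the fixed target the image chases.
text = clip.tokenize(["when the wind blows"]).to(device)
with torch.no_grad():
    target = F.normalize(model.encode_text(text), dim=-1)

# The "canvas": a 224x224 RGB image, CLIP's native input size.
pixels = torch.rand(1, 3, 224, 224, device=device, requires_grad=True)
optimizer = torch.optim.Adam([pixels], lr=0.05)

for step in range(300):
    feats = F.normalize(model.encode_image(pixels.clamp(0, 1)), dim=-1)
    loss = 1 - (feats * target).sum()  # cosine distance to the prompt
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```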



Prompt: “a face like an M.C. Escher drawing” from The Big Sleep (source: @advadnoun on Twitter)

Some of the results are compelling and intriguing, seemingly intelligent in a weird non-human way, as if you're looking into an alien's mind. Is that a face on its side, an eye, a nose, a mouth? Are those textures fingerprints? 

Prompt: “The Yellow Smoke That Rubs Its Muzzle On The Window-Panes” 
from VQ-GAN+CLIP (source: @RiversHaveWings on Twitter)

Each solution has a visual logic of theme and variation that's carried throughout the image. It's certainly not random. 

Prompt: “A Series Of Tubes” from VQ-GAN+CLIP (source: @RiversHaveWings on Twitter)

Many of the images from this system have a surrealistic patchwork appearance resembling Cubism, where extracted fragments are juxtaposed across the picture plane, but the 3D space doesn't make sense as a real scene.


(source: @ak92501 on Twitter)

Some of the creativity of this enterprise derives from the odd juxtapositions of the words in the prompts. The results are often effective with long prompts. The phrase for the image above is “a small hut in a blizzard near the top of a mountain with one light turn on at dusk trending on artstation | unreal engine”

In recent weeks, people writing prompts have realized that you can get the system to yield a more detailed style if you say "trending on artstation."

Prompt: "matte painting of someone reading papers and burning the midnight oil | trending on artstation" 
by Twitter user @ak92501

I expect that with time the results will be accepted alongside human efforts, beginning perhaps with categories like motel art, Twitter avatars, and corporate clip art. They will take their place on Instagram alongside painters and photographers. Many of the innovators in this field write their own code and come up with remarkably creative prompts, so it makes sense to think of them as artists.

As a viewer, I'm not quite sure how to respond emotionally to something that looks like art, but which didn't pass through a human consciousness.

As an artist, I'm not worried about my job. Maybe it's a vain hope, but I feel like people will always want to see images made by a human hand and filtered through a human brain rather than one made by an unfeeling machine. The question is whether eventually we'll be able to tell the difference.
--
Thanks, Chris!

Resources to learn more:
• UC Berkeley blog post, which is a good overview of techniques: Alien Dreams: An Emerging Art Scene
• Twitter account "Images.AI", which plays with these natural-language prompts and some of the same tools.

Sunday, June 6, 2021

Ian Hubert's 'Dynamo Dream'

Ian Hubert spent about three years developing this short film called Dynamo Dream.


The first episode, called Salad Mug, is set in a lived-in science-fiction future.

Hubert is a 3D digital artist who creates his worlds mostly by himself, but the visual effects are as impressive as a Hollywood film's.

I was attracted to the relaxed tone and pacing, but I just wish it started immediately with stronger visuals and a clear, engaging story.

 

His minute-long "Lazy Tutorials" are popular with digital artists, but they might make sense to traditional painters as well.

Sunday, March 7, 2021

Bringing Old Photos to Life

Old photos provide a window to life in the past. A great deal of information is contained in those photos, but a lot of visual data has been lost, too—not just the color, but other features such as subsurface scattering.

A couple of recent digital innovations have helped to bring old photos and paintings to life. There's a lot you can do with Photoshop, but there are limits to what you can accomplish with denoising, colorization, and superresolution. 

The result here has reduced some of the cragginess of the original Lincoln photo and made him look younger, but presumably that could be dialed differently. 

'Time Travel Rephotography' is a technique for recreating the natural, full-color appearance based on the original photograph and an input photo of a contemporary person. The metrics of the modern person are shifted to match those of the historic person.

The way to test this method would be to take a photo of a contemporary person using an antique process and see if you could restore the missing information to match a high-res photo of that person.  


Another digital reconstruction tool is MyHeritage, an app that takes a photographic input, or even old paintings or statues, and animates them with blinks and turns (Link to YouTube video).

Because it draws power from large data sets, the results have some convincing nuances, such as the movement of bags under the eyes. I think it would actually be more effective if the movements were more limited and subtle.  

Combining these techniques and animating them with a motion-captured actor's performance would yield even better results.

----

More about Time Travel Rephotography on Two Minute Papers

Thanks, Mel and Roger

Sunday, December 20, 2020

Generating a Flyover from a Single Image

Computer scientists have developed software that generates a drone-like flyover video based on a single input image. (Link to YouTube)
---

Wednesday, November 4, 2020

Siggraph's 2020 Demos

Siggraph is a conference of computer graphics research, now held virtually. Their Asia subchapter has just shared some of the new technical papers demonstrating techniques for digital animation and graphics. (Link to YouTube)


Here's another recent video (Link to YouTube) with highlights of their main conference. There continues to be remarkable progress in surface flow dynamics, secondary actions on deformable objects, and artistic style transfer to video source animation. 

These brief demos serve as a preview of digital tools and techniques that will filter down to individual artists, commercial cameras, and visual effects in movies. As a traditional painter, I'm fascinated to learn how these scientists analyze and reproduce familiar phenomena of the visual world.

Sunday, June 21, 2020

New App Adds Detail to Blurry Image

New software is able to take a low resolution image and add missing detail. 


The tool supplies missing information using a generative adversarial network. It draws on a big data set to generate a plausible-looking face that matches the pixellated version.


Researchers at Duke University who developed it describe the process this way: "The system scours AI-generated examples of high-resolution faces, searching for ones that look as much as possible like the input image when shrunk down to the same size."

The resulting face is photographically detailed, and it fits the initial pixellated image, but it's really only one of several possible solutions.
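The search the researchers describe can be written down compactly. Here's a toy, runnable sketch of the idea: hold a face generator fixed, and optimize its latent input so that the generated face, when shrunk, matches the low-res photo. The generator below is an untrained stand-in; a real implementation, such as the Duke team's PULSE system, would load a pretrained face GAN like StyleGAN in its place:

```python
import torch
import torch.nn.functional as F

# Untrained stand-in for a pretrained face GAN (e.g., StyleGAN).
generator = torch.nn.Sequential(
    torch.nn.Linear(512, 3 * 64 * 64),
    torch.nn.Sigmoid(),
    torch.nn.Unflatten(1, (3, 64, 64)),
)

low_res = torch.rand(1, 3, 16, 16)  # stand-in for the pixellated input photo

# Search over the generator's latent space, not over pixels.
latent = torch.randn(1, 512, requires_grad=True)
optimizer = torch.optim.Adam([latent], lr=0.05)

for step in range(500):
    hi_res = generator(latent)  # candidate high-resolution face
    shrunk = F.interpolate(hi_res, size=low_res.shape[-2:],
                           mode="bicubic", align_corners=False)
    loss = F.mse_loss(shrunk, low_res)  # does it match when shrunk down?
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Because many different high-res faces shrink down to the same pixels, different starting latents find different answers, which is exactly why the result is only one of several possible solutions.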
-----
Articles about the process from Hypebeast and Techxplore

Thursday, January 16, 2020

Wall Street Journal's Hedcuts

Top row: hand-drawn portraits; Bottom row: computer-generated versions.
The Wall Street Journal has developed an artificial-intelligence system for creating its distinctive 'hedcut' portraits.
Human-created hedcut of Grumpy Cat, 2013, courtesy Wall Street Journal
Hedcuts have a wood-engraving or scratchboard look, made up of dots and dashes.

Left: human-created 'hedcut' of actress Chloë Grace Moretz
Right: AI-created hedcut, courtesy Wall Street Journal
The AI learned the style and produced adequate results in most cases. But there were difficulties. One challenge was "teaching the tool to render hair and clothes differently than skin, which was often a matter of whether to crosshatch versus stipple."

Error cases caused by the AI working with too limited a set of data,
courtesy Wall Street Journal
"The most harrowing issue of all was overfitting, which happens when a model fits a limited set of data too closely. In this case, that meant the machine became too satisfied with its artistic ability and began producing terrifying monstrosities like these."
----
Read more on Wall Street Journal: What’s in a Hedcut? Depends How It’s Made.