Hey there, sometimes I see people say that AI art is stealing real artists’ work, but I also saw someone say that AI doesn’t steal anything, does anyone know for sure? Also here’s a twitter thread by Marxist twitter user ‘Professional hog groomer’ talking about AI art: https://x.com/bidetmarxman/status/1905354832774324356
A lot of computer algorithms are inspired by nature. Sometimes when we can’t figure out a problem, we look at how nature solves it, and that inspires new algorithms. One problem computer scientists struggled with for a long time is tasks that are very simple for humans but very complex for computers, such as simply converting spoken words into written text. Everyone’s voice is different, and even the same person may speak in different tones, with different background audio, different microphone quality, etc. There are so many variables that writing a giant program to account for them all with a bunch of IF/ELSE statements in computer code is just impossible.
Computer scientists recognized that computers are very rigid logical machines that process instructions serially, like stepping through a logical proof, while brains are decentralized and massively parallel computers that process everything simultaneously through a network of neurons. A brain’s “programming” is determined by the strengths of the connections between its neurons, which are analogue rather than digital, and it produces approximate solutions rather than the rigorous ones a traditional computer gives.
This led to the birth of the artificial neural network. This is a mathematical construct that describes a system of neurons and the configurable strengths of all its neural connections, and from that, mathematicians and computer scientists figured out ways such a neural network could also be “trained,” i.e. made to configure its neural pathways automatically so it can “learn” new things. Since it is mathematical, it is hardware-independent. You could build dedicated hardware to implement it, a silicon brain if you will, but you could also simulate it on a traditional computer in software.
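To make this concrete, here is a minimal sketch of such a construct: a tiny two-layer network whose “connection strengths” are just arrays of numbers, trained with plain gradient descent on the XOR function. All the sizes and values here are illustrative, not any particular production model.

```python
# Minimal artificial neural network sketch: the "neural connections"
# are just arrays of numbers (weights) that training nudges around.
import numpy as np

rng = np.random.default_rng(0)

W1 = rng.normal(0, 1, (2, 8))   # input -> hidden connection strengths
b1 = np.zeros(8)
W2 = rng.normal(0, 1, (8, 1))   # hidden -> output connection strengths
b2 = np.zeros(1)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)   # XOR truth table

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for step in range(5000):
    # Forward pass: propagate the input through the connections.
    h = np.tanh(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # Backward pass: nudge each connection strength to reduce error.
    d_out = out - y                      # gradient of cross-entropy loss
    d_W2 = h.T @ d_out
    d_h = (d_out @ W2.T) * (1 - h**2)
    d_W1 = X.T @ d_h

    lr = 0.1
    W2 -= lr * d_W2; b2 -= lr * d_out.sum(0)
    W1 -= lr * d_W1; b1 -= lr * d_h.sum(0)

print(np.round(out.ravel(), 2))  # close to [0, 1, 1, 0] after training
```

Nothing in this loop is XOR-specific: swap in audio features and transcript labels and the same “adjust connection strengths to reduce error” idea is what trains speech recognizers, just at vastly larger scale.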
Computer scientists quickly found that by applying this construct to problems like speech recognition, they could supply the neural network with tons of audio samples and their transcribed text, and the network would automatically find patterns and generalize from them; when brand-new audio is recorded, it can transcribe it on its own. Suddenly, problems that at first seemed unsolvable became very solvable, and the approach started to be implemented in many places. Language translation software, for example, is also based on artificial neural networks.
Recently, people have figured out that this same technology can be used to produce digital images. You feed a neural network a huge dataset of images and associated tags that describe them, and it will learn to generalize patterns that associate the images with the tags. Depending upon how you train it, this can go both ways. There are img2txt models, called vision models, that can look at an image and tell you in written text what the image contains. There are also txt2img models, which you feed a description of an image and they generate an image based upon it.
All this technology is ultimately the same, whether it’s text-to-speech, voice recognition, translation software, vision models, image generators, LLMs (which are txt2txt), etc. They are all fundamentally doing the same thing: taking a neural network and a large dataset of inputs and outputs, and training the network so it generalizes patterns from the data and can thus produce appropriate responses to brand-new data.
A common misconception about AI is that it has access to a giant database and the outputs it produces are just stitched together from that database, kind of like a collage. However, that’s not the case. The neural network is always trained with far more data than could ever fit inside the network itself, so it is impossible for it to remember its entire training dataset. (If it could, this would lead to a phenomenon known as overfitting, which would render it nonfunctional.) What actually ends up “distilled” into the neural network is just a big file called the “weights” file, which is a list of all the neural connections and their associated strengths.
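Some back-of-envelope arithmetic shows the size mismatch. The numbers below are loose, illustrative assumptions (a hypothetical 7-billion-parameter model and a 2-trillion-token text dataset), not figures for any specific model:

```python
# Illustrative comparison: size of a weights file vs. its training data.
params = 7_000_000_000            # e.g. a 7B-parameter model (assumption)
bytes_per_weight = 2              # 16-bit floats
weights_file = params * bytes_per_weight

# Suppose ~2 trillion tokens of training text, at roughly 4 bytes of
# raw text per token (a loose assumption).
training_bytes = 2_000_000_000_000 * 4

print(weights_file / 1e9, "GB of weights")     # 14.0 GB
print(training_bytes / 1e12, "TB of text")     # 8.0 TB
print(training_bytes // weights_file, "x more data than weights")
```

With numbers anywhere in this ballpark, the training data is hundreds of times larger than the weights file, so verbatim storage of the dataset simply isn’t possible.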
When the AI model is shipped, it is not shipped with the original dataset and it is impossible for it to reproduce the whole original dataset. All it can reproduce is what it “learned” during the training process.
When the AI produces something, information first enters an “input” layer of neurons, which work kind of like sensory neurons; that input may be a text prompt, an image, or something else. The information then propagates through the network, and when it reaches the end, the final set of neurons is the “output” layer, which works kind of like motor neurons: each output neuron is associated with some action, like plotting a pixel with a particular color value, or writing a specific character.
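That input-to-output flow can be sketched in a few lines. The network below is untrained (random weights), so its output is meaningless; the point is only the direction of data flow, from an encoded input, through the connections, to one output neuron per possible “action” (here, writing one character):

```python
# Sketch of input layer -> hidden layer -> output layer data flow for a
# character-writing network. Weights are random, purely for illustration.
import numpy as np

rng = np.random.default_rng(1)
vocab = list("abcdefghijklmnopqrstuvwxyz ")

def encode(ch):
    # Input layer: encode a character as numbers (a one-hot vector),
    # playing the role of sensory neurons.
    x = np.zeros(len(vocab))
    x[vocab.index(ch)] = 1.0
    return x

W_hidden = rng.normal(0, 1, (len(vocab), 32))
W_output = rng.normal(0, 1, (32, len(vocab)))

x = encode("h")                # sensory-like input
h = np.tanh(x @ W_hidden)      # propagate through the network
logits = h @ W_output          # one output neuron per character

# Each output neuron corresponds to an "action": writing that character.
next_char = vocab[int(np.argmax(logits))]
print(next_char)
```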
There is a feature called “temperature” that injects random noise into this “thinking” process. That way, if you run the algorithm many times with the same prompt, you will get different results, because its thinking is nondeterministic.
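One common way temperature is implemented (a sketch of the usual softmax-sampling approach, with made-up scores): divide the output scores by the temperature before turning them into probabilities, then sample from those probabilities. Low temperature makes the top choice dominate; high temperature flattens the distribution:

```python
# Temperature sampling sketch: higher temperature = flatter, more
# random choices; lower temperature = nearly deterministic choices.
import numpy as np

def sample(logits, temperature, rng):
    z = np.asarray(logits, dtype=float) / temperature
    z -= z.max()                          # numerical stability
    p = np.exp(z) / np.exp(z).sum()       # softmax probabilities
    return rng.choice(len(p), p=p)

logits = [2.0, 1.0, 0.1]                  # made-up output scores
rng = np.random.default_rng(0)

cold = [sample(logits, 0.1, rng) for _ in range(1000)]  # low temperature
hot = [sample(logits, 5.0, rng) for _ in range(1000)]   # high temperature

print(cold.count(0) / 1000)   # near 1.0: almost always the top option
print(hot.count(0) / 1000)    # noticeably lower: choices spread out
```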
Would we call this process of learning “theft”? Personally, I think it’s weird to say it is “theft.” It is directly inspired by how biological systems learn, of course with some differences to make it better suited to run on a computer, but the very broad principle of neural computation is the same. I can look at a bunch of examples on the internet and learn to do something, such as studying a bunch of photos as references to learn to draw. Am I “stealing” those photos when I then draw an original picture of my own? People who claim AI is “stealing” either don’t understand how the technology works, or reach for claims like “it doesn’t have a soul so it doesn’t count,” or point to differences between AI and humans which do exist but aren’t relevant differences.
Of course, this only applies to companies that scrape data that really was posted publicly for everyone to freely look at, like on Twitter or something. Some companies have been caught illegally scraping data that was never posted anywhere publicly, like Meta, which got in trouble for scraping libgen, where a lot of the material is supposed to be behind a paywall. However, the law already protects people whose paywalled data gets illegally scraped, as Meta is being sued over this, so it’s already on the side of the content creator here.
Even then, I still wouldn’t consider it “theft.” Theft is when you take something from someone, depriving them of its use. This case would instead be piracy: copying someone’s intellectual property for your own use without their permission, which ultimately doesn’t deprive the original person of the use of it. At best you can say that in some cases AI art, and AI technology in general, can be based on piracy. But that is definitely not a universal statement. And personally I don’t even like IP laws, so I’m not exactly the most anti-piracy person out there lol