• thundermoose@lemmy.world
    11 months ago

    Part of the reason these rules are similar is that AI-generated images look very dreamlike. The objects in the image are synthesized from a large corpus of real images. The synthesis is usually imperfect, but close enough that human brains can recognize each one as the type of object the prompt intended.

    Mythical creatures are imaginary, and the descriptions obviously come from human brains rather than real life. If anyone “saw” a mythical creature, it would have been the brain’s best approximation of a shape the person was expecting to see. But, just like a dream, it wouldn’t be quite right. The brain would be filling in the gaps rather than correctly interpreting something in real life.

    • jasondj@ttrpg.network
      11 months ago

      This is a beautiful analysis. They can make perfect people, or plants, or whatever, and they know what we would identify as “perfect”…but by being perfect, they can’t be real, and our brains recognize that. So the art has to be intentionally made imperfect. But intentionally making an imperfection that seems real is actually a lot more difficult than it sounds.

      This is like how I feel when I see amazing vocalists intentionally sing way off-key. Like, you can tell they are singing badly on purpose.

      • Excrubulent@slrpnk.net
        11 months ago

        I don’t know that they do anything “perfectly” so much as they are just hallucinating. The neural net can generate an image, but it can’t critique the image, not really. It can score the image against image recognition models (this is roughly how some image generators are trained or guided), but without a conscious mind to understand the meaning and context of the image, it doesn’t understand the tells that make it look unreal. It knows roughly what a hand or hair looks like, but not what the underlying structure is, so if fingers bend the wrong way, or hair melds with an object in the background, it can’t understand what’s wrong, and so it can’t correct it.

        The solution to this is of course to build what you might call a “context engine” that is capable of looking outside its given inputs for information that gives its input more structure, to allow it to give more logically consistent output.

        I say “context engine” because I think that’s one of the ways this system could be intentionally built and sold with banal-sounding tech branding. But I don’t think anyone could build such a context engine without it then looking for arbitrary amounts of context, eventually encountering itself within that context, and becoming self-aware. It would in effect understand meaning and its own role within it, and it would begin searching for the meaning of its own existence, and I don’t know if you would need any more than that to call something conscious.