• tal@lemmy.today

    Not the position Dell is taking, but I’ve been skeptical that building AI hardware directly into laptops specifically is a great idea unless people have a very concrete goal, like text-to-speech, and existing models to run on it, probably specialized ones. This is not to diminish AI compute elsewhere.

    Several reasons.

    • Models for many useful things have been getting larger, and you have a bounded amount of memory in those laptops, which, at the moment, generally can’t be upgraded (though maybe CAMM2 will improve the situation and move things back away from soldered memory). Historically, most users did not upgrade memory in their laptop even when they could. Just throwing the compute hardware in there in the expectation that models will come is a bet that the models people might want to use won’t get a whole lot larger (see the rough sizing sketch after this list). That’s an especially risky bet for the next year or two, since we expect high memory prices, which will probably price people out of putting very large amounts of memory in laptops.

    • Heat and power. The laptop form factor exists to be portable. Laptops are not great at dissipating heat, and unless they’re plugged into wall power, they have sharp constraints on how much power they can usefully draw.

    • The parallel compute field is rapidly evolving. People are probably not going to throw out and replace their laptops on a regular basis to keep up with AI stuff (much as laptop vendors might be enthusiastic about this).
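
    To make the memory point concrete, here’s that rough sizing sketch. It is strictly back-of-the-envelope and weights-only (no KV cache, no runtime overhead), so real usage is higher:

    ```python
    # Rough weights-only footprint: parameter count x bytes per parameter.
    # Back-of-the-envelope only; ignores KV cache, activations, and runtime overhead.
    BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}
    MODEL_SIZES_B = [7, 13, 70, 405]  # billions of parameters

    for params_b in MODEL_SIZES_B:
        parts = []
        for fmt, nbytes in BYTES_PER_PARAM.items():
            gib = params_b * 1e9 * nbytes / 2**30
            parts.append(f"{fmt}: {gib:6.1f} GiB")
        print(f"{params_b:>3}B params -> " + ", ".join(parts))

    # A 7B model at int4 (~3 GiB) fits comfortably in a 16 GB laptop; a 70B model
    # (~33 GiB at int4) does not, and that is before the OS and KV cache take their cut.
    ```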

    I think that a more likely outcome, if people want local, generalized AI stuff on laptops, is that someone sells an eGPU-like box that plugs into wall power and connects to the laptop over USB or some wireless protocol, and the laptop uses it as an AI accelerator. That box can be replaced or upgraded independently of the laptop itself.

    When I do generative AI stuff on my laptop, for the applications I use, the bandwidth that I need to the compute box is very low, and latency requirements are very relaxed. I presently remotely use a Framework Desktop as a compute box, and can happily generate images or text or whatever over the cell network without problems. If I really wanted disconnected operation, I’d haul the box along with me.
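
    To give a sense of how thin that link is: from the laptop’s side, the box just has to expose an HTTP endpoint. A minimal sketch, assuming an OpenAI-compatible server (llama.cpp’s llama-server, for example) is running on the remote box; the hostname, port, and model name below are placeholders:

    ```python
    # Client-side sketch: the laptop sends a short prompt and gets text back,
    # so bandwidth needs are tiny and latency tolerance is generous.
    # Assumes an OpenAI-compatible server on the compute box; address is a placeholder.
    import requests

    COMPUTE_BOX = "http://compute-box.example:8080"  # placeholder address

    resp = requests.post(
        f"{COMPUTE_BOX}/v1/chat/completions",
        json={
            "model": "whatever-the-box-has-loaded",  # placeholder model name
            "messages": [{"role": "user", "content": "Give me a two-sentence summary of CAMM2."}],
            "max_tokens": 128,
        },
        timeout=120,  # relaxed latency requirements, per the above
    )
    print(resp.json()["choices"][0]["message"]["content"])
    ```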

    EDIT: I’d also add that all of this also applies to smartphones, which have the same constraints, with even harder limits on heat, power, and space. You can hook one up to an AI accelerator box via a wired or wireless link if you want local compute, but doing a lot of compute on the phone itself means fighting the limitations inherent to the phone form factor.

    EDIT2: If you use a high-bandwidth link to such a local, external box, there’s a bonus: you also potentially get substantially increased and upgradeable graphics capability on the laptop or smartphone by using the box as an eGPU, a case where having low-latency compute available actually is quite useful.

    • cmnybo@discuss.tchncs.de

      There are a number of NPUs that plug into an M.2 slot. If those aren’t powerful enough, you can just use an eGPU.
      I would rather not have to pay for an NPU that I’m probably not going to use.

    • Nighed@feddit.uk

      I think part of the idea is: build it and they will come… If 10% of users have NPUs, then apps will find ‘useful’ ways to use them.

      Part of it is actually battery life: if you assume that at some point in the life of the laptop it will be doing AI tasks (unlikely currently), an NPU will be wayyyy more efficient than running them on a CPU, or even a GPU.

      Mostly, though, it’s an excuse to charge more for the laptop. If all the high-end players add NPUs, then customers have no choice but to shell out more. Most customers won’t realise that when they use ChatGPT or Copilot on one of these laptops, it’s still not running on their device.

    • Goodeye8@piefed.social

      I’m not that concerned with the hardware limitations. Nobody is going to run a full-blown LLM on their laptop; running one on a desktop would already require building a PC with AI in mind. What you’re going to see being used locally are smaller models (something like 7B, quantized to INT8 or INT4). Factor in the efficiency of an NPU and you could get by with 16GB of memory (especially if the models are run in INT4) with little extra power draw and heat. The only hardware concern would be the pace of technological advancement in NPUs, but just don’t be an early adopter and you’ll probably be fine.
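
      To be concrete about what “running a small model locally” looks like, here’s a minimal sketch, assuming the llama-cpp-python bindings and a Q4-quantized GGUF build of some 7B model (the file name is a placeholder):

      ```python
      # Minimal local-inference sketch using llama-cpp-python with a quantized 7B model.
      # The GGUF file name is a placeholder; at INT4 a 7B model is roughly 4 GB of
      # weights, which leaves plenty of headroom on a 16 GB machine.
      from llama_cpp import Llama

      llm = Llama(
          model_path="./some-7b-model.Q4_K_M.gguf",  # placeholder path
          n_ctx=4096,   # modest context keeps the KV cache small
          n_threads=8,  # CPU threads; n_gpu_layers would offload to a GPU backend instead
      )

      out = llm.create_chat_completion(
          messages=[{"role": "user", "content": "Draft a two-line reply to this email."}],
          max_tokens=128,
      )
      print(out["choices"][0]["message"]["content"])
      ```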

      But this is where Dell’s point comes in. Why should the consumer care? What benefits do consumers get by running a model locally? Outside of privacy and security reasons, you’re simply going to get a better result by using one of the online AI services, because you’d be using a proper model instead of the cheap one that runs on limited hardware. And even the privacy- and security-minded can just build their own AI server (maybe not today, but when hardware prices get back to normal), run it from home, and expose it to their laptop or smartphone. For consumers to want to run a model locally (actually locally, not in a self-hosting kind of way) there would have to be some problem that the local model solves and the over-the-internet solution can’t. So far no such problem exists, and there doesn’t seem to be a suitable one on the horizon either.

      Dell is keeping a foot in the door by still putting NPUs in their laptops, so that if by some miracle some magical problem is found that AI solves, they’re ready. But they realize that NPUs aren’t something they can actually use as a selling point, because as it stands there’s no benefit to running small models locally, so NPUs solve no problems.