• 0 Posts
  • 20 Comments
Joined 1 year ago
Cake day: March 22nd, 2024


  • I dunno about Linux, but on Windows I used to use something called K10stat to manually undervolt the cores, since the BIOS offered no way to do it. The difference was night-and-day dramatic, as they idled ridiculously fast and AMD left a ton of voltage headroom back then.

    I bet there’s some Linux software to do it. Look up if anyone used voltage control software for desktop Phenom IIs and such.
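    If memory serves, the knobs K10stat poked are just the family 10h P-state MSRs (0xC0010064 and up), so on Linux you can at least peek at the same FID/VID values with the msr module and a few lines of Python. This is a sketch from memory of the family 10h BKDG layout, so double-check the bit fields before you trust it (and definitely before writing anything back):

```python
import os
import struct

MSR_PSTATE_BASE = 0xC0010064  # first P-state register on AMD family 10h (Phenom II era)

def read_msr(cpu: int, reg: int) -> int:
    # requires root and the msr kernel module (`modprobe msr`)
    with open(f"/dev/cpu/{cpu}/msr", "rb") as f:
        return struct.unpack("<Q", os.pread(f.fileno(), 8, reg))[0]

for pstate in range(5):  # family 10h exposes up to 5 P-states
    val = read_msr(0, MSR_PSTATE_BASE + pstate)
    if not (val >> 63):            # PstateEn bit: skip disabled P-states
        continue
    fid = val & 0x3F               # CpuFid [5:0]
    did = (val >> 6) & 0x7         # CpuDid [8:6]
    vid = (val >> 9) & 0x7F        # CpuVid [15:9]
    mhz = 100 * (fid + 0x10) / (1 << did)  # CoreCOF = 100 MHz * (FID + 10h) / 2^DID
    volts = 1.5500 - 0.0125 * vid          # SVI encoding, if I'm reading the BKDG right
    print(f"P{pstate}: {mhz:.0f} MHz @ {volts:.4f} V (vid={vid:#04x})")
```

    Actually undervolting (what K10stat did) means writing a lower VID back to those same registers, which I wouldn’t do blind; IIRC there’s at least one Linux tool, amdctl, that wraps exactly this kind of MSR poking for K10-era chips.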



  • This bubble’s hate is pretty front-loaded though.

    Dotcom was, well, a useful thing. I guess valuations were nuts, but it looks like most of the hate came in the enshittified aftermath.

    Crypto is a series of bubbles trying to prop up flavored pyramid schemes around a neat niche concept, but people largely figured that out after they popped. And it’s not as attention-grabbing as AI.

    Machine learning is a long-running, useful field, but ever since ChatGPT caught investors’ eyes, the cart has felt so far ahead of the horse. The hate started, and got polarized, waaay before any bubble-popping.

    …In other words, AI hate almost feels more political than bubble-fueled, if that makes any sense. It is a bubble, but the extreme hate would still be there even if it wasn’t.



  • Neither did Wales. Hence, the next part of the article:

    For example, the response suggested the article cite a source that isn’t included in the draft article, and rely on Harvard Business School press releases for other citations, despite Wikipedia policies explicitly defining press releases as non-independent sources that cannot help prove notability, a basic requirement for Wikipedia articles.

    Editors also found that the ChatGPT-generated response Wales shared “has no idea what the difference between” some of these basic Wikipedia policies, like notability (WP:N), verifiability (WP:V), and properly representing minority and more widely held views on subjects in an article (WP:WEIGHT).

    “Something to take into consideration is how newcomers will interpret those answers. If they believe the LLM advice accurately reflects our policies, and it is wrong/inaccurate even 5% of the time, they will learn a skewed version of our policies and might reproduce the unhelpful advice on other pages,” one editor said.

    It doesn’t mean the original process isn’t problematic, or that it couldn’t be helpfully augmented with some kind of LLM-generated supplement. But this is like a poster child for a troublesome AI implementation: a general-purpose LLM needs context it isn’t given (but the reader assumes it has), hallucinations have knock-on effects, and even Wikipedia’s founder seemingly missed the errors.

    Don’t mistake me for being blanket anti-AI, clearly it’s a tool Wikipedia can use. But the scope has to be narrow, and the problem specific.


  • Wales’s quote isn’t nearly as bad as the headline makes it out to be:

    Wales explains that the article was originally rejected several years ago, then someone tried to improve it, resubmitted it, and got the same exact template rejection again.

    “It’s a form letter response that might as well be ‘Computer says no’ (that article’s worth a read if you don’t know the expression),” Wales said. “It wasn’t a computer who says no, but a human using AFCH, a helper script […] In order to try to help, I personally felt at a loss. I am not sure what the rejection referred to specifically. So I fed the page to ChatGPT to ask for advice. And I got what seems to me to be pretty good. And so I’m wondering if we might start to think about how a tool like AFCH might be improved so that instead of a generic template, a new editor gets actual advice. It would be better, obviously, if we had lovingly crafted human responses to every situation like this, but we all know that the volunteers who are dealing with a high volume of various situations can’t reasonably have time to do it. The templates are helpful - an AI-written note could be even more helpful.”

    That being said, it still reeks of “CEO speak,” and of trying to find a place to shove AI in.

    More NLP could absolutely be useful to Wikipedia, especially for flagging spam and malicious edits for human editors to review. This is an excellent task for dirt-cheap, small, open models, where the error rate isn’t super important; cost, volume, and reducing the load on precious human editors are. It’s an existential issue that needs work (rough sketch of what I mean at the bottom of this comment).

    …Using an expensive, proprietary API to give error-prone yet “pretty good”-sounding suggestions to new editors is not.

    Wasting dev time trying to make it work is not.

    This is the problem. Not natural language processing itself, but the seemingly contagious compulsion among executives to find some place to shove it when the technical extent of their knowledge is occasionally typing something into ChatGPT.

    It’s okay for them to not really understand it.

    It’s not okay to push it differently than other technology because “AI” is somehow super special and trendy.
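    To make the spam-flagging idea concrete, here’s roughly the shape of it; a toy sketch using Hugging Face’s stock zero-shot pipeline (the model choice, labels, and threshold are placeholders, not a claim about what Wikipedia should actually run):

```python
from transformers import pipeline

# Stock zero-shot model as a stand-in; a small, purpose-tuned open model
# would be far cheaper. This just shows the shape of the task.
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

LABELS = ["spam or vandalism", "good-faith encyclopedia edit"]

def flag_for_review(edit_text: str, threshold: float = 0.8) -> bool:
    """Return True if an edit should be queued for a human editor to look at."""
    result = classifier(edit_text[:2000], candidate_labels=LABELS)
    # result["labels"] is sorted by score, highest first
    return result["labels"][0] == LABELS[0] and result["scores"][0] >= threshold

print(flag_for_review("BUY CHEAP WATCHES at totally-legit-pharma dot biz!!!"))
```

    The point is that a false positive here just means a human glances at an edit, so a cheap, imperfect model is fine. Contrast that with an LLM confidently teaching newcomers the wrong policy, where the errors actually compound.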




  • That’s what I’m saying: there is no retooling. Some of AMD’s existing OEMs are already making W7900s.

    Here’s the bulk of the process on the OEM side, other than maybe leaving an ECC chip off:

    • Take a finished W7900.

    • Change the ID in the firmware (so the CAD drivers don’t recognize it).

    • Apply a different sticker and put it in a different box.

    • Do the paperwork for a new SKU, like they already do for overclocked cards.

    That’s not that expensive. If it doesn’t sell a lot, well, not much skin off their back. And it would make AMD boatloads by seeding development for their server cards (which the workstation cards do not do, because they are utterly pointless at those prices).

    This is all kind of a moot point though, as the 7900 series is basically sunsetted, and AMD doesn’t have a 384-bit consumer card anymore (nor a GDDR7 one to use the new, huge GDDR7 ICs).


  • Yes, it is:

    https://www.amd.com/en/products/graphics/workstations/radeon-pro/w7900.html

    https://dramexchange.com/

    16Gb (2 GB) GDDR6 ICs are averaging around $10 each. The clamshell PCB is already made, and the 7900 XTX already carries twelve of those ICs on its 384-bit bus, so doubling the VRAM in a clamshell configuration (which is what the W7900 is) means roughly a dozen extra chips, on the order of $100 of memory. GDDR6 is also a separate supply from the HBM the datacenter accelerators use. But AMD literally tells its OEMs they are not allowed to sell such clamshell configs of their cards, like they have in the past.

    The ostensible business reason is to justify the actual ‘workstation’ cards, which are laughing stocks in that space at those prices.

    Hence, AMD is left scratching its head wondering why no one is developing for the MI325X, when devs have literally zero incentive to buy AMD cards to test on.

    So what if AMD makes a bunch of “AI accelerators” and nobody buys them because they’d rather have Nvidia (which the video talked about)?

    Well, seeing how backordered the Strix Halo Framework Desktop is (even with its relatively mediocre performance), I think this isn’t a big concern.

    There is a huge market dying to get out from under Nvidia here. AMD is barely starting to address it with a 32GB model of the 9000 series, but it’s too little, too late. That’s not really worth the trouble over a 4090 or 5090, but that calculus changes if the config could be 48GB on a single card, like the 7900.



  • The pricing for memory is still pretty bad. $4K for 96GB, $5.6K for 256GB, $10K for 512GB. One can get 128GB on the M4 Max for $3.5K, at the cost of a narrower bus so it’s even slower, but generally, EPYC + a 3090 or 4090 makes a lot more sense.
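    For what it’s worth, the EPYC route is just llama.cpp-style offloading: the bulk of a big quantized model sits in cheap system RAM, and however many layers fit go to the 24GB card. A rough llama-cpp-python sketch (the path and numbers are made up, tune them per model):

```python
from llama_cpp import Llama

llm = Llama(
    model_path="/models/some-big-model-q4_k_m.gguf",  # placeholder path to a quantized GGUF
    n_gpu_layers=20,   # layers pushed into the 3090/4090's VRAM; the rest run from system RAM
    n_ctx=8192,        # context window
    n_threads=32,      # plenty of EPYC cores for the CPU-resident layers
)

out = llm("Why does VRAM capacity matter so much for local LLMs?", max_tokens=200)
print(out["choices"][0]["text"])
```

    Prompt processing is still slow that way, but at least the whole model fits.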

    SOTA quantization for these is mostly DIY; there aren’t many MLX DWQs or trellis-quantized GGUFs floating around.
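    By DIY I mean stuff like running the conversion yourself with mlx-lm, e.g. something along these lines (model name and settings are placeholders, and I’m going from memory on the convert() arguments):

```python
from mlx_lm import convert

# Pull a Hugging Face checkpoint and quantize it to 4-bit MLX weights locally.
# DWQ and fancier schemes are extra steps on top; this is just the baseline conversion.
convert(
    "mistralai/Mistral-7B-Instruct-v0.3",  # placeholder model
    mlx_path="mlx_model_4bit",
    quantize=True,
    q_bits=4,
    q_group_size=64,
)
```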

    But if you want to finetune or tinker instead of just run, you’re at an enormous disadvantage there. AMD’s Strix Halo boards are way more compatible, but not standalone yet and kinda rare at this point.


  • Funny thing is ‘Local ML’ tinkerers largely can’t afford GPUs in the US either.

    The 5090 is ludicrously expensive for its VRAM pool. So is the 4090, which is all but out of stock. Nvidia will only sell you a decent-sized pool for $10K. Hence, non-techbros here have either been building used RTX 3090 boxes (the last affordable compute GPU Nvidia ever sold), building EPYC homelabs for CPU offloading, or trying to buy those modded 48GB 4090s back.

    The insane supply chain is something like this:

    • Taiwan GPUs -> China

    • China GPU boards -> US

    • US GPU boards -> smuggled back into China

    • De-neutered (VRAM-modded) GPU boards -> sold back to the US

    All because Nvidia is playing VRAM cartel and AMD, inexplicably, is uninterested in competing with it when they could sell 48GB 7900s basically for free.