Pluralis Research (@Pluralis) / X

Pluralis is a research lab focused on collectively-owned AI.

Joined July 2024

Pinned
Pluralis Research
@Pluralis
May 20
Today we're releasing Agora: the first ever pretraining stack that allows non-collocated consumer GPUs to be competitive with centralized clusters. Agora is 15x faster than Megatron-LM in this setting and is only 1.5x less efficient in terms of tokens per unit compute than
Pluralis Research reposted
Alexander Long
@AlexanderLong
Jun 9
The single most immediate, impactful downside to AI is concentration of power risk. This is 1/100th of how bad it's going to get. The only way out of this is to have an independent model supply chain via pooled compute.
elie
@eliebakouch
Jun 9
Mythos will be bad ON PURPOSE on AI "frontier LLM research" tasks. This is very, very sad for the research community. Also, the fact that this is, on purpose, not visible to the user is crazy.
Pluralis Research reposted
Alexander Long
@AlexanderLong
May 31
Whole section on Pluralis in Chamath's substack this week right under details of Anthropic's monster round.
Pluralis Research reposted
crux
@macrocrux
May 22
Something very important is being brought into existence right now. Bricks have been laid over the last 18 months and now the tech is coming together in a way that makes commercialization possible. If this shit works, it will completely disrupt the economics of training large
Pluralis Research reposted
kel.
@kelxyz_
May 22
Article
Pluralis: The Last Revolutionary AI Protocol.
There are few occasions in life when you have the opportunity to contribute to something meaningful. With that said: Remember the name. Pluralis. It has the potential to define many cycles to come....
Pluralis Research
@Pluralis
May 20
Replying to @Pluralis
More details, code, and join instructions: pluralis.ai/docs
A live view of the run that just started: agora.pluralis.ai
Pluralis Research
@Pluralis
May 20
Replying to @Pluralis
There are still many limitations:
- Agora is not yet self-serve
- We currently cannot use nodes outside North America
- We’re restricted to our current model architecture
- We still need to operate approximately 30% of the swarm ourselves
However, the fact that this works at all
Pluralis Research
@Pluralis
May 20
Replying to @Pluralis
Because of the multi-party design, the system has several other notable properties:
- Tolerance to nodes dropping at any point
- Ability to use spot instances during training
- Ability to train across datacenters and even across providers
- Full device heterogeneity, including
Pluralis Research
@Pluralis
May 20
Replying to @Pluralis
In addition to unlocking a large amount of latent capacity, consumer cards are cheap. The average cost difference is on the order of 3x per FLOP, with many listings over 10x.
Pluralis Research
@Pluralis
May 20
Replying to @Pluralis
We estimate there are over 9.6M such devices currently running. In total, this is almost 4GW of productive compute that otherwise cannot be used for large-scale ML workloads.
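As a rough sanity check on those two figures, the sketch below divides the quoted total by the quoted device count. It treats "almost 4GW" as exactly 4 GW and "over 9.6M" as exactly 9.6M, so the result is only an approximate upper bound; the function name is hypothetical and not from Pluralis.

```python
def implied_watts_per_device(total_gigawatts: float, device_count: float) -> float:
    """Average power per device implied by a fleet-wide total (illustrative only)."""
    return total_gigawatts * 1e9 / device_count


# Assumed inputs: the rounded figures quoted in the post.
watts = implied_watts_per_device(4.0, 9.6e6)
print(f"{watts:.0f} W per device")  # prints "417 W per device"
```

That is a few hundred watts per device on average, under these rounding assumptions.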
Pluralis Research
@Pluralis
May 20
Replying to @Pluralis
Agora is launching with a consumer-card-only 8B pretrain currently underway and open for joining. The raw computational power of consumer cards is surprisingly good (2.4x 5090s have equivalent FP16 throughput to an H100), but they are hobbled in their networking and hence unable
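To make the quoted 2.4:1 ratio concrete, here is a minimal sketch that converts a number of H100s into the number of 5090s needed to match their raw FP16 throughput. It uses only the ratio stated in the post and ignores networking entirely, which the post identifies as the actual bottleneck; the constant and function names are hypothetical.

```python
import math

# Ratio quoted in the post: 2.4 RTX 5090s match one H100 in FP16 throughput.
CARDS_PER_H100 = 2.4


def consumer_cards_needed(h100_count: int) -> int:
    """Whole number of 5090s matching the raw FP16 throughput of h100_count H100s."""
    return math.ceil(CARDS_PER_H100 * h100_count)


print(consumer_cards_needed(8))  # prints 20
```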
Pluralis Research
@Pluralis
May 20
Replying to @Pluralis
Agora is a new type of distributed training system, one that has been designed from the ground up to be multi-party. It is built on over a year of focused research and brings online an enormous swath of compute that previously was not usable for large-model pretraining.