> We don’t need strict ordering here; we’re just reading a number.
That's probably the scariest sentence in the whole article.
gpderetta 14 hours ago [-]
The claim that the lock-free array is faster than the locked variant is suspicious. The lock-free array is performing a CAS for every operation, and this is going to dominate [1]. A plain mutex would do two CAS operations (or just one if it is a spin lock), so the order-of-magnitude difference is not explainable by the lock-free property.
Of course, if the mutex array is doing a linear scan to find the insertion point, that would explain the difference, but: a) I can't see the code for the alternative, and b) there is no reason why the mutex variant can't use a free list.
Remember:
- Lock-free doesn't automatically mean faster (though it has other properties that might be desirable even if slower)
- Never trust a benchmark you didn't falsify yourself.
[1] when uncontended; when contended cache coherence cost will dominate over everything else, lock-free or not.
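To put that uncontended cost model in code, here is a minimal sketch (my names, not the article's benchmark code) of why lock/unlock should cost roughly twice the atomic traffic of a lock-free insert, not ten times:

    use std::sync::atomic::{AtomicUsize, Ordering};
    use std::sync::Mutex;

    // Lock-free slot claim: exactly one atomic RMW per insert.
    static NEXT: AtomicUsize = AtomicUsize::new(0);

    fn claim_lockfree() -> usize {
        NEXT.fetch_add(1, Ordering::Relaxed) // one `lock xadd` on x86
    }

    // Mutex variant: lock and unlock are each roughly one atomic op,
    // even with zero contention.
    static NEXT_LOCKED: Mutex<usize> = Mutex::new(0);

    fn claim_locked() -> usize {
        let mut n = NEXT_LOCKED.lock().unwrap(); // atomic RMW #1
        let i = *n;
        *n += 1;
        i // guard drops here: the unlock is atomic op #2
    }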
bonzini 14 hours ago [-]
Yes, the code for the alternative is awful. However I tried rewriting it with a better alternative (basically the same as the lock-free code, but with a mutex around it) and it was still 40% slower. See FixedVec in https://play.rust-lang.org/?version=stable&mode=release&edit...
gpderetta 14 hours ago [-]
Given twice the number of CASes, about twice as slow is what I would expect for the mutex variant when uncontended. I don't know enough Rust to fix it myself, but could you try with a spin lock?
As the benchmark is very dependent on contention, it would give very different results if the threads are scheduled serially as opposed to running truly concurrently (for example, using a spin lock would be awful if running on a single core).
So again, you need to be very careful to understand what you are actually testing.
j_seigh 6 hours ago [-]
I did a lock-free, ABA-free bounded queue in C++ as kind of an exercise. I work mostly with deferred reclamation schemes (e.g. refcounting, quiescent-state-based reclamation, and epoch-based reclamation). A queue requiring deferred reclamation, like the Michael-Scott lock-free queue, is going to perform terribly, so you go with an array-based ring buffer. It uses a double-wide CAS to do the insert for the enqueue and a regular CAS to update the tail. Dequeue is just a regular CAS to update the head. That runs about 57 nsecs on my 10th gen i5 for a single producer and consumer.
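The single-producer/single-consumer corner of that design is simple enough to sketch; it needs no CAS at all, since each index has exactly one writer. This is an illustration only, not the MPMC queue described above (which needs the double-wide CAS), and dropping leftover items is omitted for brevity:

    use std::cell::UnsafeCell;
    use std::mem::MaybeUninit;
    use std::sync::atomic::{AtomicUsize, Ordering};

    // Bounded SPSC ring: fixed array, so no deferred reclamation needed.
    pub struct SpscRing<T, const N: usize> {
        buf: [UnsafeCell<MaybeUninit<T>>; N],
        head: AtomicUsize, // next slot to read  (written only by consumer)
        tail: AtomicUsize, // next slot to write (written only by producer)
    }

    // Sound only under the SPSC usage contract sketched here.
    unsafe impl<T: Send, const N: usize> Sync for SpscRing<T, N> {}

    impl<T, const N: usize> SpscRing<T, N> {
        pub fn new() -> Self {
            Self {
                buf: std::array::from_fn(|_| UnsafeCell::new(MaybeUninit::uninit())),
                head: AtomicUsize::new(0),
                tail: AtomicUsize::new(0),
            }
        }

        // Producer thread only.
        pub fn push(&self, v: T) -> Result<(), T> {
            let tail = self.tail.load(Ordering::Relaxed);
            if tail.wrapping_sub(self.head.load(Ordering::Acquire)) == N {
                return Err(v); // full
            }
            unsafe { (*self.buf[tail % N].get()).write(v) };
            self.tail.store(tail.wrapping_add(1), Ordering::Release); // publish
            Ok(())
        }

        // Consumer thread only.
        pub fn pop(&self) -> Option<T> {
            let head = self.head.load(Ordering::Relaxed);
            if head == self.tail.load(Ordering::Acquire) {
                return None; // empty
            }
            let v = unsafe { (*self.buf[head % N].get()).assume_init_read() };
            self.head.store(head.wrapping_add(1), Ordering::Release);
            Some(v)
        }
    }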
A lock-free queue by itself isn't very useful. You need a polling strategy that doesn't involve a busy loop. If you use mutexes and condvars, you've basically turned it into a lock based queue. Eventcounts work much better.
If I run more threads than CPUs and enough work so I get time slice ends, I get about 1160 nsecs avg enq/deq for the mutex version, and about 146 nsecs for the eventcount version.
Timings will vary based on how many threads you use and cpu affinity that takes your hw thread/core/cache layout into consideration. I have a gen 13 i5 that runs this slower than my gen 10 i5 because of the former's efficiency cores, even though it is supposedly faster.
And yes, a queue is a poster child for cache contention problems, un enfant terrible. I tried a back-off strategy at one point but it didn't help any.
convivialdingo 5 hours ago [-]
I tried replacing a DMA queue lock with lock-free CAS and it wasn't faster than a mutex or a standard rwlock.
I rewrote the entire queue with lock-free CAS to manage insertions/removals on the list and we finally got some better numbers. But not always! We found it worked best either as a single thread, or during massive contention. With a normal load it wasn't really much better.
michaelscott 14 hours ago [-]
For applications doing extremely high rates of inserts and reads, lock-free is definitely superior. In extremely latency-sensitive applications like trading platforms (event processing sub-100ms) it's a requirement; locked structures cause bottlenecks at high throughput.
sennalen 5 hours ago [-]
The bottleneck is context switching
scripturial 10 minutes ago [-]
So don’t bother to optimize anything?
r3tr0 14 hours ago [-]
totally valid.
that benchmark is something i should have added more alternatives to.
MobiusHorizons 18 hours ago [-]
I enjoyed the content, but could have done without the constant hyping up of the edginess of lock free data structures. I mean yes, like almost any heavily optimized structure there are trade offs that prevent this optimization from being globally applicable. But also being borderline aroused at the “danger” and rule breaking is tiresome and strikes me as juvenile.
lesser23 17 hours ago [-]
The bullet points and some of the edge definitely smell like LLM assistance.
Other than that I take the other side. I’ve read (and subsequently never finished) dozens of programming books because they are so god awfully boring. This writing style, perhaps dialed back a little, helps keep my interest. I like the feel of a zine where it’s as technical as a professional write up but far less formal.
I often find learning through analogy useful anyway and the humor helps a lot too.
bigstrat2003 18 hours ago [-]
To each their own. I thought it was hilarious and kept the article entertaining throughout, with what would otherwise be a fairly dry subject.
atoav 17 hours ago [-]
It is juvenile, but what do we know?
Real Men use after free, so they wouldn't even use Rust to begin with.
The edgy tone sounds like an LLM to me...
Animats 16 hours ago [-]
I've done a little bit of "lock-free" programming in Rust, but it's for very specialized situations.[1] This allocates and releases bits in a bitmap. The bitmap is intended to represent the slots in use in the Vulkan bindless texture index, which resides in the GPU. You can't read those slots from the CPU side to see if an entry is in use, so in-use slots in that table have to be tracked with an external bitmap.
This has no unsafe code. It's all done with compare and swap. There is locking here, but it's down at the hardware level within the compare-and-swap instruction. This is cleaner and more portable than relying on cross-CPU ordering of operations.
[1] https://github.com/John-Nagle/rust-vulkan-bindless/blob/main...
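The core of the technique, reduced to a toy single-word version (the real multi-word bitmap is in the repo at [1]; the names here are made up):

    use std::sync::atomic::{AtomicU64, Ordering};

    // Claim the first free bit in a 64-slot bitmap with a CAS loop.
    fn alloc_slot(bitmap: &AtomicU64) -> Option<u32> {
        let mut cur = bitmap.load(Ordering::Relaxed);
        loop {
            let free = (!cur).trailing_zeros(); // lowest 0 bit
            if free == 64 {
                return None; // every slot is in use
            }
            match bitmap.compare_exchange_weak(
                cur,
                cur | (1 << free),
                Ordering::AcqRel,
                Ordering::Relaxed,
            ) {
                Ok(_) => return Some(free),
                Err(seen) => cur = seen, // lost a race; retry with the fresh value
            }
        }
    }

    // Release is a single atomic AND; no loop needed.
    fn free_slot(bitmap: &AtomicU64, slot: u32) {
        bitmap.fetch_and(!(1u64 << slot), Ordering::Release);
    }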
> AtomicUsize: Used for indexing and freelist linkage. It’s a plain old number, except it’s watched 24 / 7 by the CPU’s race condition alarm.
Is it though? Aren't atomic load/store instructions the actually important thing? I know the type system ensures that `AtomicUsize` can only be accessed using atomic instructions, but saying it's being watched by the CPU is inaccurate.
eslaught 19 hours ago [-]
I'm not sure what the author intended, but one way to implement atomics at the microarchitectural level is via a load-linked/store-conditional pair of instructions, which often involves tracking the cache line for modification.
https://en.wikipedia.org/wiki/Load-link/store-conditional
It's not "24/7" but it is "watching" in some sense of the word. So not entirely unfair.
tombert 21 hours ago [-]
Pretty interesting.
I have finally bitten the bullet and learned Rust in the last few months and ended up really liking it, but I have to admit that it's a bit lower level than I generally work in.
I have generally avoided locks by making very liberal use of Tokio channels, though that isn't for performance reasons or anything: I just find locks really hard to reason about for anything but extremely trivial usecases, and channels are a more natural abstraction for me.
I've never really considered what goes into these lock-free structures, but that might be one of my next "unemployment projects" after I finish my current one.
forgot_old_user 20 hours ago [-]
definitely! Reminds me of the golang saying
> Don't Communicate by Sharing Memory; Share Memory by Communicating
https://www.php.cn/faq/1796714651.html
Yeah, similarly, Joe Armstrong (RIP), co-creator of Erlang, explained it to me like this:
> In distributed systems there is no real shared state (imagine one machine in the USA another in Sweden) where is the shared state? In the middle of the Atlantic? — shared state breaks laws of physics. State changes are propagated at the speed of light — we always know how things were at a remote site not how they are now. What we know is what they last told us. If you make a software abstraction that ignores this fact you’ll be in trouble.
He wrote this to me in 2014, and it has really informed how I think about these things.
throwawaymaths 19 hours ago [-]
The thing is that go channels themselves are shared state (if the owner closes the channel and a client tries to write you're not gonna have a good time)! Erlang message boxes are not.
tombert 8 hours ago [-]
Strictly speaking they’re shared state, but the way you model your application around channels is generally to have independent little chunks of work and the channels are just a means of communicating. I know it’s not one-for-one with Erlang.
Kubuxu 8 hours ago [-]
You can think of closing the channel as sending a message saying “there will be no further messages”; the panic on write is enforcement of that contract.
Additionally, the safe way to use closing of a channel is for the writer to close it. If you have multiple writers, you have to either synchronise them or not close the channel at all.
kbolino 9 hours ago [-]
You don't have to close a channel in Go and in many cases you actually shouldn't.
Even if you choose to close a channel because it's useful to you, it's not necessarily shared state. In a lot of cases, closing a channel behaves just like a message in its queue.
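For comparison, Rust's std::sync::mpsc builds that reading into the types: the channel is "closed" when every Sender has been dropped, and recv() reports it only after the queue drains. A small sketch:

    use std::sync::mpsc;
    use std::thread;

    fn main() {
        let (tx, rx) = mpsc::channel();
        thread::spawn(move || {
            for i in 0..3 {
                tx.send(i).unwrap();
            }
            // tx is dropped here: that is the close.
        });
        // Drains 0, 1, 2, then recv() returns Err(RecvError):
        // the queued "there will be no further messages" message.
        while let Ok(v) = rx.recv() {
            println!("got {v}");
        }
    }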
aatd86 18 hours ago [-]
Isn't entanglement in quantum physics the manifestation of shared state? tongue-in-cheek
psychoslave 5 hours ago [-]
Maybe. Or maybe we observe the same point of information source from two points which happen to be distant in the 3 coordinates we are accustomed to dealing with, but both close to this single point in some other.
gpderetta 14 hours ago [-]
> Don't Communicate by Sharing Memory; Share Memory by Communicating
that's all well and good until you realize you are reimplementing a slow, buggy version of MESI in software.
Proper concurrency control is the key. Shared memory vs message passing is incidental and application specific.
revskill 19 hours ago [-]
How can you be unemployed?
psychoslave 5 hours ago [-]
By the default state of any entity in the universe, which is to not be employed?
tombert 9 hours ago [-]
Just the market. I don’t have a lot of reasons outside of that.
zero0529 17 hours ago [-]
Like the writing style but would prefer if it was dialed down maybe 10%.
Otherwise a great article as an introduction to lock-free datastructures.
gmm1990 9 hours ago [-]
Is the advantage of a freelist over just an array of the values (implemented like a ring buffer) that you don't have to consume values in order? It just seems like throwing in a pointer lookup would add a lot of latency for something that's so latency-sensitive.
jillesvangurp 14 hours ago [-]
This is the kind of stuff that you shouldn't have to reinvent yourself but be able to reuse from a good library. Or the standard library even.
How would this compare to the lock free abstractions that come with e.g. the java.concurrent package? It has a lot of useful primitives and data structures. I expect the memory overhead is probably worse for those.
Support for this is one of the big reasons Java and the JVM have been a popular choice for companies building middleware and data processing frameworks for the last few decades; exactly the kind of stuff that the author of this article is proposing you could build with this. Things like Kafka, Lucene, Spark, Hadoop, Flink, Beam, etc.
gpderetta 14 hours ago [-]
> This is the kind of stuff that you shouldn't have to reinvent yourself but be able to reuse from a good library. Or the standard library even.
Indeed; normally we call it the system allocator.
A good system allocator will use per-thread or per-CPU free lists so that it doesn't need to do CAS loops for every allocation, though. At the very least it will use hashed pools to reduce contention.
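A sketch of the per-thread trick (hypothetical names; real allocators are much fancier about refills and frees): each thread grabs a batch of slots with one atomic op, then hands them out with no synchronization at all.

    use std::cell::RefCell;
    use std::sync::atomic::{AtomicUsize, Ordering};

    static NEXT_BLOCK: AtomicUsize = AtomicUsize::new(0);
    const BATCH: usize = 64;

    thread_local! {
        // This thread's private cache of free slot indices.
        static CACHE: RefCell<Vec<usize>> = RefCell::new(Vec::new());
    }

    fn alloc_index() -> usize {
        CACHE.with(|c| {
            let mut c = c.borrow_mut();
            if let Some(i) = c.pop() {
                return i; // fast path: no atomics at all
            }
            // Slow path: one atomic RMW amortized over BATCH allocations.
            let base = NEXT_BLOCK.fetch_add(BATCH, Ordering::Relaxed);
            c.extend(base + 1..base + BATCH);
            base
        })
    }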
Fiahil 11 hours ago [-]
You can go one step further if:
- you don't reallocate the array
- you don't allow updating/removing previously inserted values
In essence it becomes a log: a Vec<OnceCell<T>> or a Vec<UnsafeCell<Option<T>>>. Works well, but only for a bounded array, so applications like messaging or inter-thread communication are not a perfect fit.
It's a fixed-size vector that can be read at the same time it's being written to. It's not a common need.
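A sketch of that shape, using OnceLock (the Sync flavor of OnceCell) so it can actually be shared across threads; the names are made up:

    use std::sync::atomic::{AtomicUsize, Ordering};
    use std::sync::OnceLock;

    // Bounded, append-only, write-once log.
    struct BoundedLog<T> {
        slots: Vec<OnceLock<T>>,
        next: AtomicUsize,
    }

    impl<T> BoundedLog<T> {
        fn new(cap: usize) -> Self {
            Self {
                slots: (0..cap).map(|_| OnceLock::new()).collect(),
                next: AtomicUsize::new(0),
            }
        }

        // Claim an index with one atomic add, then publish the value.
        fn append(&self, v: T) -> Result<usize, T> {
            let i = self.next.fetch_add(1, Ordering::Relaxed);
            match self.slots.get(i) {
                Some(slot) => {
                    let _ = slot.set(v); // cannot fail: the index was claimed uniquely
                    Ok(i)
                }
                None => Err(v), // log full
            }
        }

        // Readers can run concurrently with appends to later slots.
        fn get(&self, i: usize) -> Option<&T> {
            self.slots.get(i)?.get()
        }
    }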
jonco217 13 hours ago [-]
> NOTE: In this snippet we ignore the ABA problem
The article doesn't go into details, but this is a subtle way to mess up writing lock-free data structures:
https://en.wikipedia.org/wiki/ABA_problem
i will do another one on just the ABA problem and how many different ways it can put your program in the hospital.
r3tr0 3 days ago [-]
hope you enjoy this article on lock free programming in rust.
I used humor and analogies in the article not just to be entertaining, but to make difficult concepts like memory ordering and atomics more approachable and memorable.
nmca 19 hours ago [-]
Did you get help from ChatGPT ooi? The humour sounds a bit like modern ChatGPT style but it’s uncanny valley.
bobbyraduloff 18 hours ago [-]
at the very least that article was definitely edited with ChatGPT. i had someone on my team write “edgy” copy with ChatGPT last week and it sounded exactly the same. short paragraphs and overuse of bullet points are also a dead giveaway. i don’t think it’s super noticeable if you don’t use ChatGPT a lot but for the people that use these systems daily, it’s still very easy to spot.
my suggestion to OP: this was interesting material, but ChatGPT made it hard to read. use your own words to explain it. most people interested in this deeply technical content would rather read your prompt than the output.
r3tr0 14 hours ago [-]
i had GPT help with some grammar, editing, and shortening.
The core ideas, jokes, code, and analogies are 100% mine.
Human chaos. Machine polish.
tombert 21 hours ago [-]
Interesting read, I enjoyed it and it answered a question that I didn't even realize I had been asking myself for years, which is how lock-free structures work.
Have you looked at CTries before? They're pretty interesting, and I think are probably the future of this space.
rurban 8 hours ago [-]
It still wouldn't lead to proper Rust concurrency safety, because their IO is still blocking.
fefe23 18 hours ago [-]
To borrow an old adage:
The determined programmer can write C code in any language. :-)
MobiusHorizons 18 hours ago [-]
Atomics are hardly “C”. They are a primitive exposed by many CPU ISAs to help navigate the complexity those same CPUs introduced with out-of-order execution and complex caches in a multi-threaded environment. Much like SIMD, atomics require extending the language through intrinsics or new types because they represent capabilities that were not possible when the language was invented. Atomics require this extra support in Java just as they do in Rust or C.
lucraft 13 hours ago [-]
Can I ask a dumb question: how is an atomic set operation implemented internally, if not by grabbing a lock?
moring 13 hours ago [-]
Two things that come to my mind:
1. Sometimes "lock-free" actually means using lower-level primitives that use locks internally but don't expose them, with fewer caveats than using them at a higher level. For example, compare-and-set instructions offered by CPUs, which may use bus locks internally but don't expose them to software.
2. Depending on the lower-level implementation, a simple lock may not be enough. For example, in a multi-CPU system with weaker cache coherency, a simple lock will not get rid of outdated copies of data (in caches, queues, ...). Here I write "simple" lock because some concepts of a lock, such as Java's "synchronized" statement, bundle the actual lock together with guaranteed cache synchronization, whether that happens in hardware or software.
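Point 1 can be made concrete: at this level the dependency runs the other way, and locks are built out of atomics. A deliberately naive spinlock, for illustration only:

    use std::sync::atomic::{AtomicBool, Ordering};

    struct SpinLock {
        locked: AtomicBool,
    }

    impl SpinLock {
        const fn new() -> Self {
            Self { locked: AtomicBool::new(false) }
        }

        fn lock(&self) {
            // swap is a single atomic read-modify-write instruction
            // (e.g. xchg on x86); there is no "lock" below this.
            while self.locked.swap(true, Ordering::Acquire) {
                std::hint::spin_loop();
            }
        }

        fn unlock(&self) {
            self.locked.store(false, Ordering::Release);
        }
    }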
gpderetta 12 hours ago [-]
Reminder that lock-free is a term of art with very specific meaning about starvation-freedom and progress and has very little to do with locking.
gpderetta 12 hours ago [-]
The hardware itself is designed to guarantee it. For example, the core guarantees that it will perform the load + compare + store from a cacheline in a finite number of cycles, while the cache coherency protocol guarantees that a) the core will eventually (i.e. it is fair) be able to acquire the cacheline in exclusive mode and b) will be able to hold it for a minimum number of clock cycles before another core forces an eviction or a downgrade of the ownership.
Asraelite 10 hours ago [-]
It does use locks. If you go down deep enough you eventually end up with hardware primitives that are effectively locks, although they might not be called that.
The CPU clock itself can be thought of as a kind of lock.
sophacles 12 hours ago [-]
Most hardware these days has intrinsic atomics - they are built into the hw in various ways, both in memory-model guarantees (e.g. x86 has very strong guarantees of cache coherency, ARM not so much) and in instructions (e.g. xchg on x86). The details vary a lot between different CPU architectures, which is why C++ and Rust have memory models to program to rather than the specific semantics of a given arch.