Conversation
-
Hey @p , last I knew, you were working on something big, code-wise, that you were getting close to releasing. I wasn't clear what it was, but it sounded interesting
-
@p @TradeMinister >human readability suffers
That's why you have to put "b00b" and "babe" on there.
$ dig +short ryona.agency AAAA
2605:6400:10:1fe:dead:babe:b00b:cafe
-
@p
@TradeMinister
>> Sounds like C++; maybe that kind of bloat was in the air then.
> One of the objections to RFC 1597 that were laid out in RFC 1627, "Network 10 Considered Harmful", was that private networks would make it take longer to exhaust the IPv4 address space, which was about to be exhausted any minute now (where "now" meant "1994"), so the sky has been falling since IPv6 was called "IPng", and the IPv6 guys were trying to make the sky fall faster so that the perfect system could replace the good one.
Nice. Perfect... Good... Sounds familiar somehow...;)
Well, one nice thing about ipv4 is that a human can remember and write 127.0.0.1, whereas ipv6 addresses look machine readable only. I suppose it is necessary for one's refrigerator and microwave and desk clock to have their own ip addresses... Or maybe none of that shit belongs on the Net, and fuck the IoT.
>> Otherwise, a VM and a debugger.
> Yeah, that was my suspicion, more or less: it's easier to simulate a CPU than to interpret C, and without a lot of benefit: you've still got to wander through all of the files, parse them, etc. That was the slowest part of the compiler until the gcc team ran out of stuff to do and decided they could "optimize" by making 30 passes that squeeze out almost nothing and do dumb shit like rewriting printf to puts (seriously, they do this, and they mangle the string constant in the binary and this turns one syscall into two, which is more pessimization than optimization).
I heard about that bs 'optimization'. Fscking c++ mindset.
I wonder if anyone ever made a register-transfer-language emulator. gcc first compiles to RTL, which looks a lot like an assembly language, then does its optimizations on the RTL. Seems like if one could single-step an RTL machine, not optimize, and have comments in the RTL linking back to the code line by line, one could have something like emacs+gcc 'playing' the code, and showing changes to vars as they happen. Of course, emulating some of the sketchy stuff one does in systems coding, like say hand-building a thread and feeding it to a scheduler, could get interesting. But normal, user-mode C could probably be emulated.
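(For anyone who wants to peek at the RTL without building an emulator, gcc will dump it on request; a minimal sketch, and the exact dump-file names vary by version:)
/* square.c -- a throwaway function to feed the compiler */
int square(int x) { return x * x; }
$ gcc -c -fdump-rtl-expand square.c
$ ls square.c.*r.expand   # the dump: one parenthesized insn after another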
>> I can't say I ever really saw the point, but I was almost never in userspace anyway.
> :craylol: Chads live in Ring 0. :chuckmoore2:
;) The real thrill was microcode: everything was mission-critical, and if something goes wrong, the machine either freezes or starts producing random values. No debugging other than reading source. Definitely a young man's sport. One really had to be able to single-step the (parallel) hardware in one's head. And modern architectures with branch-prediction and such... Maybe they have emulators LOL.
I'll look at the technical stuff when I feel smarter, meaning after coffee.
-
@TradeMinister
> Well, one nice thing about ipv4 is that a human can remember and write 127.0.0.1, whereas ipv6 addresses look machine readable only.
IPv6's equivalent is "::1", which is almost all zeroes, which feels wrong. But the "::" notation, where you pick a run of zeroes and that's where the "::" goes, that's really upsetting, not just that implementations are going to be error-prone but human readability suffers. 2620:0:863:ed1a::1, 2607:f8b0:4007:818::200e, 2606:4700:3032::6815:20f3, 2606:4700:3037::ac43:bc4e, all real IPv6 addresses, your eyes glaze over.
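(Spelled out, for the record, the shorthand in that first one is hiding three all-zero groups:)
2620:0:863:ed1a::1  =  2620:0000:0863:ed1a:0000:0000:0000:0001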
> I suppose it is necessary for one's refrigerator and microwave and desk clock to have their own ip addresses... Or maybe none of that shit belongs on the Net, and fuck the IoT.
Even 64-bit addresses would have allowed for a publicly routeable address for more than the number of chips that have been manufactured or that *can* be manufactured once we've scraped every grain of copper from the earth: it allows for 2.4 b-b-b-billion addresses per living human. 64-bit addresses are excessive, but I could understand it. 128-bit addresses (4.2e28 addresses per human, so each human could assign 20,000 addresses to every star in the known universe) go way past the laughable "and we won't have to ever fix anything again because this time the protocol is perfect!" approach and into the realm of parody.
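(Back-of-envelope, and the figures wobble a bit with whichever population estimate you pick:)
2^64  ≈ 1.8e19, so 1.8e19 / 7.7e9 people ≈ 2.4e9  addresses per person
2^128 ≈ 3.4e38, so 3.4e38 / 7.7e9 people ≈ 4.4e28 addresses per person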
> I heard about that bs 'optimization'. Fscking c++ mindset.
Not just the C++ mindset, but now also C++ programmers: a couple of years ago they snuck C++ into gcc's source code, so the C compiler is no longer written in C.
> The real thrill was microcode
I should correct it to "Chads *implement* Ring 0 for kernel programmers and other mortals."
-
@TradeMinister
> Sounds like C++; maybe that kind of bloat was in the air then.
One of the objections to RFC 1597 that were laid out in RFC 1627, "Network 10 Considered Harmful", was that private networks would make it take longer to exhaust the IPv4 address space, which was about to be exhausted any minute now (where "now" meant "1994"), so the sky has been falling since IPv6 was called "IPng", and the IPv6 guys were trying to make the sky fall faster so that the perfect system could replace the good one.
> Otherwise, a VM and a debugger.
Yeah, that was my suspicion, more or less: it's easier to simulate a CPU than to interpret C, and without a lot of benefit: you've still got to wander through all of the files, parse them, etc. That was the slowest part of the compiler until the gcc team ran out of stuff to do and decided they could "optimize" by making 30 passes that squeeze out almost nothing and do dumb shit like rewriting printf to puts (seriously, they do this, and they mangle the string constant in the binary and this turns one syscall into two, which is more pessimization than optimization).
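(Easy to see for yourself, by the way; a trivial case, behavior as I've seen it from mainstream gcc at -O2, details varying by version:)
/* hello.c */
#include <stdio.h>

int main(void)
{
    printf("hello, world\n");   /* no format specifiers, trailing newline */
    return 0;
}
$ gcc -O2 -S hello.c
$ grep -E 'printf|puts' hello.s   # the call that survives is puts, not printf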
> I can't say I ever really saw the point, but I was almost never in userspace anyway.
:craylol: Chads live in Ring 0. :chuckmoore2:
> It really was. Best job I ever had, except for being a river guide, but it paid a lot better. Doing that 9 months of the year, then guiding in Summer: that would be hard to beat. Unless there's a surf-guiding gig. That might be better.
Ha, definitely ideal.
-
@TradeMinister Your critiques of IPv6, should you get around to reading it, are probably better informed than mine are. There's some nice stuff in there, but there's also some stuff that I wish they hadn't, some stuff that they speculated they'd want in the 1990s and then expanded on over the years without any negative feedback. (The easy target is the address notation.)
> like a guy who AFAIK worked on nothing but a C interpreter the whole time I was there and AFAIK never got it working
People have this vague notion that a C interpreter ends up being useful for a lot of things. I think it seems like a very large amount of trouble and I can't identify the payoff. Apparently CERN had one and there was one knocking around Bell Labs, but I can't find it.
> My wing was just engineers
Idyllic.
-
@p
@TradeMinister
> Your critiques of IPv6, should you get around to reading it, are probably better informed than mine are.
Probably not; I was never a TCP/IP stack guy.
> There's some nice stuff in there, but there's also some stuff that I wish they hadn't, some stuff that they speculated they'd want in the 1990s and then expanded on over the years without any negative feedback. (The easy target is the address notation.)
Sounds like C++; maybe that kind of bloat was in the air then.
> People have this vague notion that a C interpreter ends up being useful for a lot of things. I think it seems like a very large amount of trouble and I can't identify the payoff. Apparently CERN had one and there was one knocking around Bell Labs, but I can't find it.
If one wants a C interpreter, play with Javascript instead. Otherwise, a VM and a debugger. I can't say I ever really saw the point, but I was almost never in userspace anyway.
Guy doing it was this weird hairy little guy, maybe Dan something. Maybe he went off to CERN or Bell Labs and got it working.
> > My wing was just engineers
> Idyllic.
It really was. Best job I ever had, except for being a river guide, but it paid a lot better. Doing that 9 months of the year, then guiding in Summer: that would be hard to beat. Unless there's a surf-guiding gig. That might be better.
-
@TradeMinister
> X.25 was a great fantasy protocol, clearly written by Eurocrat philosophers. They actually tasked me, by myself, with implementing it! I hadn't thought about it, but recently realized that this was nuts:
I've never implemented it, but I have heard a lot of complaints; I have been interested in AX.25 lately, SDR having become ridiculously cheap. (Even when they get a chance to see what has been done elsewhere, they do this: they look at NTSC and decide to invent PAL.)
> Fortunately I had already decided to leave, and just stalled them while I worked on leaving the kernel and ucode in the best state I could.
Pride in craftsmanship is too rare!
> Strictly blue sky: a good and maybe easy way to test both code and protocol might be to have VMs jabbering at each other.
In fact, I've started this.
> Eventually you'd have to simulate bad actors trying to meddle, but that would be down the road.
The best part is that I don't have to! On fedi, you can let the bad actors come to you.
> TCP/IP was written to survive a nuclear war; it was a pleasing model of pragmatism and efficiency.
IPv6 was the result of 30 years of daydreaming; it shows.
> It sounds like what you are working on is to ruggedize the Fedi protocol against cyber warfare; the RFC will be interesting!
I think I might get a paper out of it, too, if I'm lucky. (I don't know if you caught it or not, but FSE technically has been cited in a paper presented to the ACM, but it was more anthropological, some people examining the moderation system that had cropped up around fedi.)
-
@p
I haven't read IPv6, but 30 years of daydreaming doesn't sound good. The people who wrote IPv4 and TCP were very good, but pragmatic, practical, "what works and will keep working" engineers, I think. Some were probably at BBN, but 5 or 10 years before me.
MIT guys, I would guess. That's how it read to me, anyway.
BBN supported some dreamers, like a guy who AFAIK worked on nothing but a C interpreter the whole time I was there and AFAIK never got it working, but he was probably up in the Tower, where they did actual research. He was probably a Harvard guy. My wing was just engineers, more MIT types.
-
@p X.25 was a great fantasy protocol, clearly written by Eurocrat philosophers. They actually tasked me, by myself, with implementing it! I hadn't thought about it, but recently realized that this was nuts: I didn't even do network code, Steve Dyer handled the TCP/IP stack and I did the rest. (I just read the RFCs for pleasure: they were like a well-designed machine architecture, which I also read for aesthetic enjoyment).
Fortunately I had already decided to leave, and just stalled them while I worked on leaving the kernel and ucode in the best state I could.
Strictly blue sky: a good and maybe easy way to test both code and protocol might be to have VMs jabbering at each other. Eventually you'd have to simulate bad actors trying to meddle, but that would be down the road.
TCP/IP was written to survive a nuclear war; it was a pleasing model of pragmatism and efficiency.
It sounds like what you are working on is to ruggedize the Fedi protocol against cyber warfare; the RFC will be interesting!
-
@p
I quite enjoyed the original TCP/IP RFCs. Back in the day, if you were on ARPAnet, DARPA would just send them to you. I may still have copies somewhere. Anyways, I considered them excellent examples of how to design and document a protocol (and X.25 a perfect example of how not to).
Revolver is first and foremost a protocol; the implementations are secondary. In practice, the two will evolve together. I'd enjoy seeing your protocol docs when you get around to writing them.
-
@TradeMinister
> I quite enjoyed the original TCP/IP RFCs. Back in the day, if you were on ARPAnet, DARPA would just send them to you.
That is very cool! Plan 9 apparently ships with a script that syncs your local copy of the RFCs, so after the initial large chunk of updates, you get /lib/rfc/rfc822 and whatnot, and you can just grep. It's insanely convenient. Aside from that, a lot of the early ones are a good read: some of them show some great foresight or they're interesting for historical reasons, and a handful of them are quaint, like that student who announced a program he wrote.
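(E.g., once it's synced, something like:)
$ grep -in 'received' /lib/rfc/rfc822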
> In practice, the two will evolve together.
Yeah; if you try to write that kind of thing down before it works, it will inevitably become a fantasy protocol that can't be implemented properly. (Getting it to talk to ActivityPub servers has been an adventure.) The decoding has been stable enough that I think the wire format won't budge again, but it's easier to justify technical decisions when you can say "We tried that, this is what happened, this is why we can't do it that way." (Working code is the ultimate defense of an architectural decision, code that *continues* working is the ultimate defense of a design decision.)
-
@TradeMinister Yep, not done yet, plus I have to pay bills so my time is split at present.
-
@mint @TradeMinister Neckbeard used to have something fun in the address, but I forget what it was and I can't look it up now because neckbeard.xyz got revoked.
-
@TradeMinister Ouch.
We do have two Draculae: :dracula: and :dracula2:.
-
@TradeMinister
> Butbutbut space! And galaxies and shit! Ancient aliens!
Europa's 1200-baud line is gonna take an entire second just for the address header on these packets.
> I don't know if he was involved in ObjC, but I think it would be more to his liking, and Apple (much as I love to hate them) made a top-notch choice in adopting it instead of C++.
Yeah, seconded. I wonder why Objective-C isn't widely used outside Apple. Message-passing, very little BS. (Funny story, someone got ahold of the old cfront source and ported it, you can now run 1980s-style C++ on Plan 9. I don't know if anyone's gotten it to work on Linux.)
> I'd be shocked and saddened to hear he blessed moving gcc to c++, or allowing 'optimizations' like transforming printf() to puts().
It's parts of gcc, not the whole thing, but once you've invited the vampire into your house, none of your daughters are safe. As far as the questionable optimizations, I don't know who's responsible, but it's the kind of bug that makes me write off an entire organization. I don't think they measure, they just overdesign; maybe whoever did the work on -Os was smarter, it frequently outperforms -O3 since it can keep the whole damn program in icache if you're not linking against anything embarrassing. Even on ARM (where gcc's output is actually atrocious; I don't think anyone's working on it, all the world's an x86-64), without the massive caches, it tends to produce better code (and more readable code, to boot).
> On the BBN C machine, it was interesting: start load/store, do other things for 3 cycles, data now available.
Did you have to schedule or do the nop-padding yourself, or did you just get a delay if you did a load, add, then store immediately? Like if you had an inner loop that did:
load r0, [r1]
add r1, r1, #4
store [r2], r0
add r2, r2, #4
Would you have to add a couple of nops after the load, or would the machine handle the delay if you didn't?
The VLIW stuff is interesting, Itanium had some facilities like that, but that chip existed entirely to scare shareholders at Sun and DEC, and now high-end workstations are all x86. The way they implemented it was to do explicit slots, it was something like 128-bit instructions and you could do a load or a store in one of them and pack what you wanted into the other ones. (I may be hallucinating this, but I think there was a MIPS design that worked with slots. The GreenArrays chips do.) Compiler writers seem to have trouble with this kind of thing but I'd rather have a bad compiler on a chip that doesn't reorder instructions.
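(From memory, and I could be off on the details, the IA-64 layout was:)
128-bit bundle = 5-bit template + 3 x 41-bit instruction slots   (5 + 3*41 = 128)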
> The superfast whizzo chips with their pipelines and hyperthreading and such probably still spend lots of their time idling waiting for all... those... nanoseconds to pass before memory accesses complete.
You might be surprised: it's something like 90%. 90% of the time spent in a normal server load with a normal server CPU is waiting on the memory bus. So then the chips are designed to make up a fantasy world ("let's say that branch gets taken"), speculatively execute it, and then, if the branch doesn't get taken, toss it out and recompute it. So the register file gets fatter, power usage goes up because computers now make up work for themselves if they are bored waiting, all this extra space on the die dedicated to something that turns out to be fraught with Spectres and Meltdowns and whatnot.
> I used to overclock a bit, and found that boosting one's cpu frequency a lot mattered much less than boosting the bus speed a little, except for CPU benchmarks.
Not surprising, I think; I usually can't tell the difference if I underclock the CPU. These little ARM systems, I usually run them with the CPU at the minimum clock, it's like 200MHz, 400MHz, and until I try to use a browser, I can't tell. It gets hot in the summer, I do the same thing to my desktop system's CPU sometimes, same results.
-
@p
Speaking of which, I was well into a long response, got to your 'vampire' and decided to see what 'colon v' (I dare not type it) would get me, and Husky just trashed the response. I'll try again later.
-
@p
@TradeMinister
> > Yeah, 64 bits sounds fine, and might even be human-readable.
> Not super readable, but readable enough, and fits in a register, and more addresses than Earth will need.
Butbutbut space! And galaxies and shit! Ancient aliens!
> > Hard to imagine rms allowing this. I guess he must have handed gcc to a subsequent generation.
> I'm not privy.
Me either. But I did sort of know him a little bit (you probably would have too, if you were in my location). And I got the sense that he was a pretty old-skool guy in terms of K&R V7 etc design philosophy: human-readable, minimalist. While he did invent Emacs Lisp and lived in the MIT AI lab, I can't imagine him inventing the C++ abortion. I don't know if he was involved in ObjC, but I think it would be more to his liking, and Apple (much as I love to hate them) made a top-notch choice in adopting it instead of C++.
I'd be shocked and saddened to hear he blessed moving gcc to c++, or allowing 'optimizations' like transforming printf() to puts().
> Yeah; I think overall, load/store was a good thing.
On the BBN C machine, it was interesting: start load/store, do other things for 3 cycles, data now available.
> I think, for the complication, a better memory bus would have been a wiser investment than the mess of a pipeline we have now.
Exactly. Busses are key. I believe one of the key parts of the Zen design is something they call a 'fabric' or some such; anyways, a chip-internal, very intelligent bus or network of busses.
It's not for nothing they used to call it data processing: unless one is doing AI or heavy number-crunching, one is usually plowing thru lots of data. The superfast whizzo chips with their pipelines and hyperthreading and such probably still spend lots of their time idling waiting for all... those... nanoseconds to pass before memory accesses complete. And that's not even thinking about contention between cores, locking, etc.
I used to overclock a bit, and found that boosting one's cpu frequency a lot mattered much less than boosting the bus speed a little, except for CPU benchmarks.
-
@TradeMinister
> Yeah, 64 bits sounds fine, and might even be human-readable.
Not super readable, but readable enough, and fits in a register, and more addresses than Earth will need.
> Hard to imagine rms allowing this. I guess he must have handed gcc to a subsequent generation.
I'm not privy.
> In one instruction, you might simultaneously tell the ALU to add, start a memory read or write, tell the shift-rotate register to do something
Yeah; I think overall, load/store was a good thing.
> none of this speculative execution stuff which has turned out to be problematic.
That's an understatement. I'm not sure it even buys you much; I think, for the complication, a better memory bus would have been a wiser investment than the mess of a pipeline we have now.
-
@p
@TradeMinister
> Even 64-bit addresses would have allowed for a publicly routeable address for more than the number of chips that have been manufactured or that *can* be manufactured once we've scraped every grain of copper from the earth: it allows for 2.4 b-b-b-billion addresses per living human. 64-bit addresses are excessive, but I could understand it. 128-bit addresses (4.2e28 addresses per human, so each human could assign 20,000 addresses to every star in the known universe) go way past the laughable "and we won't have to ever fix anything again because this time the protocol is perfect!" approach and into the realm of parody.
Yeah, 64 bits sounds fine, and might even be human-readable.
> Not just the C++ mindset, but now also C++ programmers: a couple of years ago they snuck C++ into gcc's source code, so the C compiler is no longer written in C.
Hard to imagine rms allowing this. I guess he must have handed gcc to a subsequent generation.
> > The real thrill was microcode
> I should correct it to "Chads *implement* Ring 0 for kernel programmers and other mortals."
Machines, at least the minis I knew about and the one I worked on, were simpler then. In one instruction, you might simultaneously tell the ALU to add, start a memory read or write, tell the shift-rotate register to do something, but there was none of this speculative execution stuff which has turned out to be problematic.
Writing instruction emulation ucode for modern chips must be really hard compared to what we did back then.
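(Purely to give the flavor, not the actual BBN format, field names invented: one wide word, independent fields all firing in the same cycle.)
/* a made-up 32-bit horizontal microinstruction */
struct uinsn {
    unsigned alu_op   : 4;   /* ADD, SUB, AND, pass-through, ...    */
    unsigned mem_op   : 2;   /* idle / start read / start write     */
    unsigned shift_op : 3;   /* what the shift-rotate unit does     */
    unsigned a_bus    : 4;   /* register driven onto the A bus      */
    unsigned b_bus    : 4;   /* register driven onto the B bus      */
    unsigned dest     : 4;   /* which latch catches the result      */
    unsigned next     : 11;  /* next-microaddress / branch control  */
};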
-
@p
Human readability, as opposed to machine readability, was a core design rule in the good old days. Good times.
-
@TradeMinister "You won't have to worry about that, the computer does it for you!" Then the lives of the people that build the computer to do it for you get worse.
-
@p
> Yep, I've got a solution to this. Block collisions, when the block size is small enough (8kB), should make the storage requirements go log-scale.
You'd have to dumb this down for me. Block collisions, which I assume means two data blocks hashing to the same thing, sound like a crash of the protocol to me.
> Aside from that, there's a tuneable GC process (in progress), basically doing slow-motion mark/sweep, then evicting LRU. (Full nodes don't want to GC anything, and for normal nodes, mark what that node has published, then what that node is interested in, and the rest is transient data.)
Yeah, good, so a normal node is locally maintaining a hot data cache. But if full nodes never GC, storage requirements grow to infinity, but perhaps in practice no faster than storage capacity tends towards infinity, and we have a 'which infinity is larger' question.
> > inode is created to link to some or all of the data in the dataspace
> You think this might work better than just walking the tree?
The inode struct is just what I am most used to. But there are all manner of ways of keeping track of the actual data objects that underlie abstract objects. I'm sure you've already come up with better ideas about that than I'm likely to. And there's all the history, NFS and such, to look at.
> One of the issues is that there are the top-level objects assembled from blocks, and then there is this slush of blocks.
The abstract objects, and the messy data.
> (Six queues: explicit outgoing, signed outgoing, network outgoing, explicit incoming, signed incoming, network incoming, all managed somewhat defensively.)
This would need dumbing down for me to comment.
> So a large number of unaddressed blocks moving through is the norm, then there's a heuristic for which ones we don't need when GC pressure happens.
Need definition of 'unaddressed': transiting network-chatter not addressed by objects of local interest, or...?
> > so the protocol can decide about how many duplicates of a block (there should obviously be redundancy) to keep around, and where to keep them
> The protocol has to assume the entire network is hostile; I think this precludes most classes of group-level decision.
There I go, thinking we're on the Arpanet, a high-trust low-security environment.
> Essentially, what we're doing is more or less a lightly tweaked Kademlia,
Did a quick read: this sounds right.
> nodes have a cooperation score for other nodes (key-based; that is, independent of the means by which we talk to that node). Ideally, your node has all of the things you've published and are interested in, plus mirrors of other blocks that eventually get swept if no one cares about them.
Sounds good. Cooperation presumably meaning a node doesn't violate protocol, answers requests in a timely fashion, doesn't produce bad data, doesn't show signs of trying to crack the protocol.
> > possibility of incompletion has to be contemplated.
> That's something we have to deal with, yeah.
NFS amusingly enough decided not to, being 'stateless'. I remember I was at an Olivetti (they actually did computers in Europe) conference in Florence in '86, and the NFS people were presenting their protocol, and I saw a window or race condition (actually, as one can imagine, a 'stateless' protocol is prone to them) and pointed it out, and their answer was that it hadn't happened yet. Not sure, but I think years later I read that yeah actually it *did* happen, and wasn't good.
So you have to think about things like multiple writers to the same addresses in the same object colliding, and writes silently failing out there somewhere, unless you have locking in the protocol, and some kind of handshake where a write isn't a write until wherever the data lives says it is. I'm sure people much smarter than I have thought long and hard. NFS, as one might expect of the BSD people (I got *really stoned* in their hotel room at a DC Unix conference in maybe '84), opted for fast and simple over slow, complex and reliable, and in practice it mostly worked. OK for almost anything, but maybe not real mission-critical stuff.
But I think that's the basic tradeoff. And now that I think of what the Fedi is, and how no one will die if an old meme gets lost, fast and simple might be the right choice.
-
@p I'd like to read that ACM paper.
I was lying in bed after a nap, and thinking about the object management, or filesystem if you prefer, that might underlie a decentralized, ruggedized Fedi, and I got to thinking about refcounts and access times. The flaw I saw in the fs (dataspace?) you described at one point is that blocks are never deleted: the space just grows to infinity, and most of it is dead garbage eventually.
Thus, refcounts. When a path is created in the namespace, and an inode is created to link to some or all of the data in the dataspace, the inode gets a refcount of 1, which ++s with additional paths to it and --s on unlink (this part is standard *nix fs; I'm not addressing decentralizing this yet).
In standard *nix, upon the last unlink() setting the inode refcount to zero, all of its data is freed. In a decentralized protocol, it might be better if the data objects (blocks, extents) also have refcounts, for a number of reasons. First, if the refcounts go to zero, the data can be deleted, or placed in dead storage, a candidate for deletion.
Secondly, and a reason also for access times, is so the protocol can decide about how many duplicates of a block (there should obviously be redundancy) to keep around, and where to keep them. High refcount, recently accessed data should be kept in fast storage with many copies, and low refcount unaccessed storage, off to the morgue with it.
It will make quoting/duplicating/retweeting an object more costly, because in addition to at least creating a new path in namespace and upping the refcount in the inode, the inode level of the protocol has to send out updates to all the data objects, and the protocol either has to handshake, meaning every action has to be acked by the recipient, or the possibility of incompletion has to be contemplated.
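(A sketch of the shape I have in mind, names invented, nothing to do with your actual code:)
#include <stddef.h>

struct block {
    unsigned char hash[32];   /* content address                          */
    unsigned long refs;       /* how many inodes/objects point here       */
    long          atime;      /* last access, for the keep-or-morgue call */
    size_t        len;
    unsigned char data[];     /* the payload                              */
};

void morgue(struct block *b); /* hypothetical: demote to dead storage */

void block_get(struct block *b) { b->refs++; }

void block_put(struct block *b)
{
    if (--b->refs == 0)
        morgue(b);            /* refcount hit zero: candidate for deletion */
}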
-
@TradeMinister
> the space just grows to infinity, and most of it is dead garbage eventually.
Yep, I've got a solution to this. Block collisions, when the block size is small enough (8kB), should make the storage requirements go log-scale. Aside from that, there's a tuneable GC process (in progress), basically doing slow-motion mark/sweep, then evicting LRU. (Full nodes don't want to GC anything, and for normal nodes, mark what that node has published, then what that node is interested in, and the rest is transient data.)
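(Roughly, and with a completely invented store API, the collision part amounts to this:)
#include <stddef.h>

struct store;                 /* opaque handle, invented for the sketch */
void block_hash(const unsigned char *buf, size_t len, unsigned char out[32]);
int  store_has(struct store *s, const unsigned char key[32]);
int  store_insert(struct store *s, const unsigned char key[32],
                  const unsigned char *buf, size_t len);

int put_block(struct store *s, const unsigned char *buf, size_t len)
{
    unsigned char key[32];

    block_hash(buf, len, key);              /* content-address the block    */
    if (store_has(s, key))                  /* "collision": already held    */
        return 0;                           /* identical data is kept once  */
    return store_insert(s, key, buf, len);
}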
> inode is created to link to some or all of the data in the dataspace
You think this might work better than just walking the tree? One of the issues is that there are the top-level objects assembled from blocks, and then there is this slush of blocks. (Six queues: explicit outgoing, signed outgoing, network outgoing, explicit incoming, signed incoming, network incoming, all managed somewhat defensively.) So a large number of unaddressed blocks moving through is the norm, then there's a heuristic for which ones we don't need when GC pressure happens.
> so the protocol can decide about how many duplicates of a block (there should obviously be redundancy) to keep around, and where to keep them
The protocol has to assume the entire network is hostile; I think this precludes most classes of group-level decision. Essentially, what we're doing is more or less a lightly tweaked Kademlia, nodes have a cooperation score for other nodes (key-based; that is, independent of the means by which we talk to that node). Ideally, your node has all of the things you've published and are interested in, plus mirrors of other blocks that eventually get swept if no one cares about them.
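(In cartoon form; this is an illustration, not the actual update rule, and the weights are placeholders:)
/* per-peer score, keyed by the node's public key, not by how we reach it */
struct peer {
    unsigned char key[32];
    double        coop;       /* higher = better citizen */
};

/* nudge the score on each interaction: a simple decaying average */
void peer_observe(struct peer *p, int behaved)
{
    p->coop = 0.9 * p->coop + 0.1 * (behaved ? 1.0 : 0.0);
}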
> possibility of incompletion has to be contemplated.
That's something we have to deal with, yeah.