
That seems to show that there exist a and b such that the equality holds. But not that it holds for all a and b.


Which constraints on a,b (besides positive) does this proof require?


a and b have to be real numbers, whereas the identity works for any commutative ring.


And how exactly did you come to the conclusion that this is relevant here?


By the fact that the geometric proof in the link wants to prove the formula, but only does so for a small subset of all a,b for which the formula is correct. This makes it a partial proof, at best.


Ok nvm I can't resist wasting my time and typing stuff on the internet again, probably gonna regret it later.

How is it not obvious to the dullest of the dull that this visual proof is not supposed to work for goddamn commutative rings lmao

It's probably not even supposed to work for negative reals, 0 or the case b>a. It's supposed to demonstrate the central idea of the visual proof. Also yes, by choosing suitable ways to interpret the lengths shown in the diagrams it's absolutely possible to extend the proof to all reals but I'm not convinced it's meant to be interpreted like that.

But bringing commutative rings into this... man you're funny


> (besides positive)

You can chart a and b on a 2D coordinate system, where they're allowed to be negative. Even positivity is not strictly required here.


I'm not sure I quite understand what the visual proof looks like if they're negative


Check the quadrant which the result ends up in.


b<a Edit: and b,a element of R, but ok...


Really? It even holds true for either a=0 or b=0.


But not for b > a.


Just rename a to b and b to a.


Is that allowed? You will prove another equation. You cannot swap pi and e either.


You are not swapping the values, you are swapping the names.


I forgot the i for irony...


pi and e aren't names; they're values. Of course you can't swap values.


Why the downvote? That's a correct argument.


It is not. a and b are not symmetric in this equation, you can't just swap them.


You can swap them without loss of generality (WLOG).


No, this is not correct. WLOG means: I assume one of the possible cases, but the proof works the same way for the other cases. But that's not true here. The proof, as shown, only works for a>b>0; it does not work (without extra work or explanation) for a<b. The proof for a<b is similar, but not the same. [And it certainly does not show it for a,b element of C]


WLOG just means the other cases follow from the one case. There is no implication about how hard it is to get to the other cases, although generally it is easy and you don't bother spelling it out exactly.


Of course you can. What do you mean?


The answer is going to be negative regardless of the names, so this geometric proof won't work.


3^2 - 2^2 != 2^2 - 3^2.

(You can exchange a and b in, say a^2+b^2, because 2^2+3^2=3^2+2^2)


This is not what I meant. What is being proved is: a^2-b^2 - (a+b)(a-b) = 0. If you swap a and b you end up with a sign switch on the lhs which is inconsequential.


That is not what the proof proves. The proof proves the equation as it was originally stated, and for that assumes b<a.

Your rewriting is of course true for all a,b and might be used in an algebraic proof. But this transformation is not at all shown in the geometric proof.


Did you think that I meant you can switch them on one side of the equation but not the other?

That's not what anyone is saying.


No, of course not.


But that's literally what you just did in your example.


I did not show the right side at all, so I am not sure how you can make that statement.

The point is that a+b is symmetric in a <-> b and a-b is anti-symmetric. Both left and right side are anti-symmetric.


(a^2 - b^2) = -(b^2 - a^2)

use the same visual proof but with a and b switched to get

-(b + a)(b - a) = (a + b)(a - b)
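
For completeness, here is that swap argument spelled out, assuming the geometric picture has already established the identity for a >= b >= 0; negative values would still need separate treatment:

    Given:      a^2 - b^2 = (a + b)(a - b)   for a >= b >= 0
    For b > a, apply the given result with the names exchanged:
                b^2 - a^2 = (b + a)(b - a)
    Multiply both sides by -1:
                a^2 - b^2 = -(b + a)(b - a) = (a + b)(a - b)
    so the identity carries over to the remaining case.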


It is though, at the group level. The groups that adopt will have better survival than groups that don't, if the environment is such that parents regularly die while the cubs are small.


Whether group selection happens is still much debated among biologists, and not at all resolved!


Do you think group selection applies to eusocial species such as bees? It must. If so, imagine a species that is incrementally less social than eusocial. Unless there is some threshold degree of sociality required for group selection, then clearly some level of group selection occurs for other animals (whether or not it's a major effect).


This is usually explained by kin selection, which is different from group selection. I'm not an expert on either though.


Very cool article, love these.

> This brute force approach would work for short codes, but not for long ones. To generate all of the length 10 sequences would require computing about a billion hashes (8^10). That would work on my laptop, but length 11 codes (8 billion hashes) would take a while, and 12 (68 billion hashes) would be a stretch.

We live in the future though. 68 billion hashes is absolutely possible on a laptop.


Also, the hash function shown should be easier to do than a „real“ cryptographic hash, right? The hash function looks pretty simple compared to something that’s artificially designed to take a lot of time to compute.


Yeah, for sure. It's as expensive to generate the permutations as it is to do the hashing in this case!


And another thing I noticed: because the hash is built button by button, you can reuse part of the state when checking sequences. So if you're checking a 10-button sequence, you get all of its prefixes almost for free (you just need a comparison after every step). Getting to length-18 sequences is still a lot of calculation though.


Good point - the dictionary attack produces some permutations that are too long, but it doesn't matter because you get the effect as soon as the final character of the correct code is entered.
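
A minimal sketch of that prefix-reuse idea in Python (the step() update, TARGET, and the constants are hypothetical stand-ins, not the hash from the article): walking the tree of button presses depth-first hashes each prefix exactly once, so every candidate along the way costs only a single comparison.

    TARGET = 0x1234ABCD     # hypothetical hash of the unknown code
    BUTTONS = range(8)      # 8 possible buttons
    MAX_LEN = 10            # pure Python is slow at this scale; the point is the state reuse

    def step(state: int, button: int) -> int:
        # Placeholder per-button update; substitute the real hash step here.
        return ((state * 31) ^ button) & 0xFFFFFFFF

    def search(state: int = 0, prefix: tuple = ()) -> None:
        # Every node of the search tree is itself a candidate code, so all
        # prefixes of a long sequence are checked essentially for free.
        if prefix and state == TARGET:
            print("candidate code:", prefix)
        if len(prefix) == MAX_LEN:
            return
        for b in BUTTONS:
            search(step(state, b), prefix + (b,))

    search()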


Would be very interested in trying an English-to-English mode to wash away my ESL accent.


Love the design. Also love the attitude, wish we had more clear communication like this.

Would absolutely use this service if I was in need of a logo.


I don't think building an island is the answer. It will make it even more difficult to integrate into the culture. And if you don't want to integrate into the culture, why are you there in the first place?


>And if you don't want to integrate into the culture, why are you there in the first place?

A lot of immigrants leave their home countries not because they love the culture of their new country, but because they found living in their old country unbearable for some reason. Or just for economic reasons.

Not everyone actually wants to integrate into a new culture; many don't. Just look at how many people in the US don't speak English, even though that's obviously the dominant culture.


Maybe we should just avoid trying to classify things as good or bad.



I have long dreamt about building a portable phased array for this purpose, but additionally using the phase difference between receivers to visualize where the transmission source is.

In effect a camera into the RF world!


Could be combined with AR, https://hackaday.com/2019/01/09/smartphone-app-uses-ar-to-vi...

DIY radio telescope tuned to 2.4Ghz WiFi, https://www.youtube.com/watch?v=g3LT_b6K0Mc

Image of physical building overlaid with RF sources, https://www.facebook.com/thethoughtemporium/posts/2162600763...

> When I made the first video and photoshoped my impression of what I thought this would look like, I never imagined it would actually be this close. It's official, our telescope can map the wifi in a building as if it were any other form of light.


Oh neat! Yes, that's the kind of image I want to see.
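
The phase-difference part is simple in principle; a rough sketch with illustrative numbers (not from any real design), using the far-field relation delta_phi = 2*pi*d*sin(theta)/lambda for two receivers spaced d apart:

    import math

    C = 3.0e8  # speed of light, m/s

    def angle_of_arrival(phase_delta_rad, freq_hz, spacing_m):
        # Invert delta_phi = 2*pi*d*sin(theta)/lambda for theta (radians from broadside).
        wavelength = C / freq_hz
        s = phase_delta_rad * wavelength / (2 * math.pi * spacing_m)
        return math.asin(max(-1.0, min(1.0, s)))

    # 2.4 GHz source, antennas half a wavelength (6.25 cm) apart, 60 degrees of phase difference:
    print(math.degrees(angle_of_arrival(math.radians(60), 2.4e9, 0.0625)))  # ~19.5 degrees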


> Seriously, they decided that the program counter should be a general purpose register. Why???

Don't really understand this reaction. Why not? Seems to make for a nice regular design that the PC is just another register.


The big issue is that it doesn't really need to be a GPR. You never find yourself using the PC in instructions other than in, say, the occasional add instruction for switch case jumptables, or pushes / pops. So it ends up wasting instruction space, when you could've had an additional register, or encoded a zero register (which is what AARCH64 does nowadays).


PC-relative instructions are how ARM loads 32-bit literals. "ldr,=label" is a pseudoinstruction that becomes relative to the PC, and it picks the nearest literal pool as the actual address to read from.


Funnily enough, for my own toy CPU project I elected to make the program counter a GPR, both readable and writable - because it allowed me to eliminate jump and branch instructions.

Reads from the PC return the address of the next instruction to be executed, so a simple exchange between two registers performs the branch, and supplies the return address. (I did end up special-casing the add instruction so that when adding to the PC the return address ends up in the source register.)


But it is (or was originally) used in lots of places, not just jump tables, generally to do relative addressing, for example when you want to refer to data nearby, e.g.

ADD r0, r15, #200

LDR r1, [r15, #-100]

etc


Ah I miscommunicated, I still think PC can and should be used in places like the operand of an LDR / ADD. It's using it as the output of certain instructions (and allowing it to be used as such) that I take issue with. ARMv4T allowed you to set PC as the output of basically any instruction, allowing you to create cursed instructions like this lol:

eor pc, pc, pc


Isn't writing to it except by a branch instruction undefined behaviour?

If you can use it as an operand, it has a register number, so you can use it as a result, unless you special-case one or the other, which ARM didn't do because it was supposed to be simple. They could have ignored it by omitting some write decode circuitry, but why?


It's not really UB, I've seen games do things like this before. Basically, all data processing instructions can now act as branch instructions, simply by having their dest be PC. Bowser's Inside Story on the DS for example liked to use EOR to write to PC, as a form of encrypting their pointers.

Yeah I think AARCH64 special cases it? Not too familiar with their encoding or how they achieved it. My guess as to why is that it allows you to use more helpful registers (e.g. a zero register) in data processing instructions.

I think I can see your point though - from the perspective of ARMv4T's design, which was to be a simple yet effective CPU, making the PC a GPR does its job. Nowadays the standards are different, but I can see why it made sense at the time.


Not undefined behavior, just won't switch in or out of THUMB mode.


The thing is... it isn't just another regular register. Reads from the PC are 8 bytes ahead, because that's where the PC is in the pipeline at read time. Branches are PC+8 relative for the same reason.

It's a design choice that makes sense in the classic RISC world where pipeline details are leaked to make implementation simpler. Delay slots in other RISCs work the same way. But it causes a lot of pain as implementations evolve beyond the original design, and it's why Aarch64 junked a lot of Aarch32's quirks.


This is one of the features of early ARM cores that show that the thing is not a classical RISC design, but a traditional CISC core with a RISC-like ISA and optimized internal buses.


Prefetching and other pipeline interactions need to be special-cased when any instruction reads or writes it.


Yeah, it was that way for all previous ARM processors too, for exactly that reason. Adding special cases would have increased the transistor count, for no great benefit.

The only downside was that it exposed internal details of the pipelining IIRC. In the ARM2, a read of the PC would give the current instruction's location + 8, rather than its actual location, because by the time the instruction 'took place' the PC had moved on. So if/when you change the pipelining for future processors, you either make older code break, or have to special case the current behaviour of returning +8.

Anyway, I don't like their reaction. What they mean is 'this decision makes writing an emulator more tricky' but the author decides that this makes the chip designers stupid. If the author's reaction to problems is 'the chip designers were stupid and wrong, I'll write a blog post insulting them' then the problem is with the author.


Hey, I'm the author, sorry it came off that way, I was really just poking fun. I should've definitely phrased that better!

But no, I really think that making the program counter a GPR isn't a good design decision - there are pretty good reasons why no modern arches do things that way anymore. I admittedly was originally in the same boat when I first heard of ARMv4T - I thought putting the PC as a GPR was quite clean, but I soon realized it just wastes instruction space, makes branch prediction slightly more complex, and decreases the number of available registers (increasing register pressure), all while providing marginal benefit to the programmer.


I think I misread your tone, sorry.

It's a good article though, the explanation of how multiplies work is nicely written.


No worries, I'm glad you brought it up so I could amend the article to be more respectful to those that I look up to :)


Anyway, I took the time to rewrite that paragraph a bit to be more respectful. :P Hopefully that'll come off better. The website takes a few minutes to update.


The reason was that it makes subroutine return and stack frame cleanup simpler.

You know this, but background for anyone else:

ARM's subroutine calling convention places the return address in a register, LR (which is itself a general purpose register, numbered R14). To save memory cycles - ARM1 was designed to take advantage of page mode DRAM - the processor features store-multiple and load-multiple instructions, which have a 16-bit bitfield to indicate which registers to store or load, and can be set to increment or decrement before or after each register is stored or loaded.

The easy way to set up a stack frame (the way mandated by many calling conventions that need to unwind the stack) is to use the Store Multiple, Decrement Before instruction, STMDB. Say you need to preserve R8, R9, R10:

STMDB SP!, {R8-R10, LR}

At the end of the function you can clean up the stack and return in a single instruction, Load Multiple with Increment After:

LDMIA SP!, {R8-R10, PC}

This seemed like a good decision to a team producing their first ever processor, on a minimal budget, needing to fit into 25,000 transistors and to keep the thermal design power cool enough to use a plastic package, because a ceramic package would have blown their budget.

Branch prediction wasn't a consideration as it didn't have branch prediction, and register pressure wasn't likely a consideration for a team going from the 3-register 6502, where the registers are far from orthogonal.

Also, it doesn't waste instruction space: you already need 4 bits to encode 14 registers, and it means that you don't need a 'branch indirect' instruction (you just do MOV PC,Rn) nor 'return' (MOV PC,LR if there's no stack frame to restore).

There is a branch instruction, but only so that it can accommodate a 24-bit immediate (implicitly left-shifted by 2 bits so that it actually addresses a 26-bit range, which was enough for the original 26-bit address space). The MOV immediate instruction can only manage up to 12 bits (14 if doing a shift-left with the barrel shifter), so I can see why Branch was included.

Indeed, mentioning the original 26-bit address space: this was because the processor status flags and mode bits were also available to read or write through R15, along with the program counter. A return (e.g. MOV PC,LR) has an additional bit indicating whether to restore the flags and processor state, indicated by an S suffix. If you were returning from an interrupt it was necessary to write "MOVS PC, LR" to ensure that the processor mode and flags were restored.

# It was acceptable in the 80s, it was acceptable at the time... #

Ken Shirriff has a great article "Reverse engineering the ARM1" at https://www.righto.com/2015/12/reverse-engineering-arm1-ance....

Getting back to multipliers:

ARM1 didn't have a multiply instruction at all, but experimenting with the ARM Evaluation System (an expansion for the BBC Micro) revealed that multiplying in software was just too slow.

ARM2 added the multiply and multiply-accumulate instructions to the instruction set. The implementation just used the Booth recoding, performing the additions through the ALU, and took up to 16 cycles to execute. In other words it only performed one Booth chunk per clock cycle, with early exit if there was no more work to do. And as in your article, it used the carry flag as an additional bit.

I suspect the documentation says 'the carry is unreliable' because the carry behaviour could be different between the ARM2 implementation and ARM7TDMI, when given the same operands. Or indeed between implementations of ARMv4, because the fast multiplier was an optional component if I recall correctly. The 'M' in ARM7TDMI indicates the presence of the fast multiplier.
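
For the curious, here is a rough Python sketch of a radix-4 (two-bits-per-step) Booth multiply with early exit, in the same spirit as the scheme described above but not a model of the actual ARM2 datapath:

    def booth_mul(a: int, b: int, width: int = 32) -> int:
        # Returns (a * b) mod 2**width using radix-4 Booth recoding: two
        # multiplier bits are consumed per step, and the loop exits early
        # once no multiplier bits remain.
        mask = (1 << width) - 1
        a &= mask
        m = b & mask          # remaining multiplier bits
        acc = 0
        prev = 0              # overlap bit from the previous pair
        shift = 0
        while m or prev:      # early exit
            b0, b1 = m & 1, (m >> 1) & 1
            digit = -2 * b1 + b0 + prev    # Booth digit in {-2, -1, 0, 1, 2}
            acc += digit * (a << shift)
            prev = b1
            m >>= 2
            shift += 2
        return acc & mask

    assert booth_mul(7, 6) == 42
    assert booth_mul(0xDEADBEEF, 0xCAFEBABE) == (0xDEADBEEF * 0xCAFEBABE) & 0xFFFFFFFF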


> Adding special cases would have increased the transistor count, for no great benefit.

No, it's exactly backwards: supporting PC as a GPR requires special circuitry, especially in original ARM where PC was not even fully a part of the register file. Stephen Furber in his "VLSI RISC Architecture and Organization" describes in section 4.1 "Instruction Set and Datapath Definition" that quite a lot of additional activity happens when PC is involved (which may affect the instruction timings and require additional memory cycles).

