NuttX RTOS for PinePhone: Framebuffer

monocasa · on Dec 30, 2022

   // Copy the Entire Framebuffer to itself,
   // to fix the missing pixels.
   // Not sure why this works.

That has big lack of proper cache flushing energy. ARM-A device support tends to be where you need to get really intentional about managing your memory hierarchy. Smaller cores tend to have simple enough (or no) caches that they don't tend to get in your way much except for knife edge bugs. Bigger systems like x86 just tend to push the cache coherency out even to IO devices. ARM-A class SoCs are that sweet spot of a ton of caches between the CPU and main memory, but simple enough peripherals and fabric that only the CPU cores are coherent.

jbirer · on Dec 30, 2022

Yep, and I am not comfortable using software developed by people who don't know about this. Almost the entire ARMv8 TRM constantly mentions cache considerations.

fexecve · on Dec 31, 2022

Well, I am. Hacky workarounds like this might be ugly and slow, but modern chips are really fast, and I don't expect an RTOS graphics display to be the fastest thing in the world.

pixelfarmer · on Dec 31, 2022

Let me tell you one thing: Even modern chips are not fast enough to pull stupid stunts like that. This is topped off with a horrible routine which manually copies the whole thing bytewise (= 1/4th of a pixel per loop). Using SIMD (Advanced SIMD, a.k.a. NEON, on ARMv8) you can transfer something like 16 pixels at 32bpp per loop, just to give you an idea (i.e. 64x the amount of data).

Or to put it differently: 1440x720x32bpp is about 4MiB of memory which needs to be read and then written again, per frame. At 60FPS that is 8MiB (4MiB read plus 4MiB written) * 60FPS = 480MiB/s of data you shovel around for no reason. Add the loop overhead from this 1/4th-of-a-pixel-per-loop routine and your awesome fast modern chip is being kept busy with garbage, eating your battery while doing so.
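For reference, the arithmetic above can be worked through exactly. This is a minimal sketch in C; the helper names `fb_bytes` and `copy_traffic_per_sec` are made up for illustration and are not part of NuttX, and the 720x1440 at 32bpp and 60FPS figures are the ones quoted in this comment.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes occupied by one framebuffer (hypothetical helper).
 * Assumes bits-per-pixel is a whole number of bytes. */
static uint64_t fb_bytes(uint32_t width, uint32_t height, uint32_t bpp)
{
    assert(bpp % 8 == 0);
    return (uint64_t)width * height * (bpp / 8);
}

/* Bytes moved per second when every frame is read once and
 * written once, as the copy-to-itself workaround does. */
static uint64_t copy_traffic_per_sec(uint64_t frame_bytes, uint32_t fps)
{
    return 2 * frame_bytes * fps;
}
```

With these numbers, `fb_bytes(720, 1440, 32)` is 4,147,200 bytes (about 3.96MiB) and `copy_traffic_per_sec(fb_bytes(720, 1440, 32), 60)` is 497,664,000 bytes per second (about 475MiB/s), so the rounded 480MiB/s estimate holds.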

Now you want to avoid updating the whole screen all the time, but there is obviously a more fundamental issue and this should be fixed properly instead of keeping the CPU busy with nonsense. Also keep in mind that this also may lock or delay other data transfers in the system, depending on the system + bus arbitration settings etc.

fexecve · on Jan 1, 2023

That's true, battery consumption is an issue.

numpad0 · on Dec 31, 2022

Coherency bugs crash systems, which triggers a reboot if the device is engineered right. If anything, the correct way of doing it is marginally slower.

pixelfarmer · on Dec 30, 2022

Exactly. The artifacts shown are stereotypical caching issues: always the same length (the cache line size), and entries written earlier are more likely to be fully committed than ones written later, but overall it looks (seemingly) random.

pengaru · on Dec 30, 2022

Even my x86 ThinkPad has had iGPU driver bugs surrounding LLC flushing with linux kernel updates over the years. They'd manifest with cache-line sized graphics noise/artifacts particularly when showing fullscreen animations.

monocasa · on Dec 30, 2022

That one I have on good authority is a hardware bug. PCI devices are supposed to be cache coherent, but Intel got a little too cute when integrating their iGPU into the uncore (and northbridge before it) at a level that looks like PCI to software but is actually their internal interconnect.

jancsika · on Dec 30, 2022

So is copying the thing to itself the only solution in the interim in order to force the thing to copy itself correctly into the thing?

Veserv · on Dec 30, 2022

You manually flush the addresses of the framebuffer via a cache maintenance instruction.

The internet tells me the phone has a Cortex-A53, which is ARMv8-A, so probably a loop of DC CIVAC (Data Cache Clean and Invalidate by Virtual Address to Point of Coherency) instructions over the framebuffer. Though the buffer might be large enough to not even fit in the cache, so it might be more efficient to just flush the entire cache.
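A minimal sketch of such a loop in C, under stated assumptions: the 64-byte line size is hard-coded (real code would read it from CTR_EL0), the names `cache_lines_spanned` and `flush_dcache_range` are hypothetical rather than NuttX APIs, and the code is assumed to run at a privilege level where DC CIVAC is permitted. On non-AArch64 builds the instructions are compiled out and the function only counts lines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size in bytes; real code should derive it from CTR_EL0. */
#define CACHE_LINE_SIZE 64u

static_assert((CACHE_LINE_SIZE & (CACHE_LINE_SIZE - 1u)) == 0,
              "line size must be a power of two");

/* Number of cache lines touched by the byte range [addr, addr + len). */
static size_t cache_lines_spanned(uintptr_t addr, size_t len)
{
    if (len == 0) {
        return 0;
    }
    uintptr_t mask  = ~(uintptr_t)(CACHE_LINE_SIZE - 1u);
    uintptr_t first = addr & mask;
    uintptr_t last  = (addr + len - 1) & mask;
    return (size_t)((last - first) / CACHE_LINE_SIZE) + 1;
}

/* Clean and invalidate every data cache line covering the buffer,
 * then wait for the maintenance to complete. Returns lines visited. */
static size_t flush_dcache_range(const void *buf, size_t len)
{
    size_t lines   = cache_lines_spanned((uintptr_t)buf, len);
    uintptr_t line = (uintptr_t)buf & ~(uintptr_t)(CACHE_LINE_SIZE - 1u);

    for (size_t i = 0; i < lines; i++, line += CACHE_LINE_SIZE) {
#if defined(__aarch64__)
        __asm__ volatile("dc civac, %0" : : "r"(line) : "memory");
#endif
    }
#if defined(__aarch64__)
    /* Barrier so the cache maintenance finishes before the display reads. */
    __asm__ volatile("dsb sy" : : : "memory");
#endif
    return lines;
}
```

For a 720x1440 framebuffer at 32bpp this visits 64,800 lines per flush, which is why flushing the entire cache in one go might turn out cheaper.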

aidenn0 · on Dec 31, 2022

Or just map the FB as WT.
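A sketch of what that involves on ARMv8-A, assuming the standard MAIR_EL1 attribute encodings; the helper names and the choice of attribute slot are hypothetical, and how NuttX actually builds its page tables is not shown here. The memory type is an 8-bit attribute stored in one of eight MAIR_EL1 slots, and each stage-1 page descriptor picks a slot through its AttrIndx field (bits [4:2]).

```c
#include <assert.h>
#include <stdint.h>

/* ARMv8-A MAIR attribute bytes for Normal memory
 * (outer policy in bits [7:4], inner policy in bits [3:0]). */
#define MAIR_NORMAL_WB 0xFFu /* Write-Back, non-transient, R/W-allocate */
#define MAIR_NORMAL_WT 0xBBu /* Write-Through, non-transient, R/W-allocate */
#define MAIR_NORMAL_NC 0x44u /* Non-cacheable */

/* Store an attribute byte in slot idx (0..7) of a 64-bit MAIR value. */
static uint64_t mair_set(uint64_t mair, unsigned idx, uint8_t attr)
{
    assert(idx < 8);
    unsigned shift = idx * 8;
    mair &= ~((uint64_t)0xFF << shift);
    return mair | ((uint64_t)attr << shift);
}

/* Point a stage-1 descriptor at MAIR slot idx via AttrIndx, bits [4:2]. */
static uint64_t pte_set_attrindx(uint64_t pte, unsigned idx)
{
    assert(idx < 8);
    pte &= ~((uint64_t)0x7 << 2);
    return pte | ((uint64_t)idx << 2);
}
```

For example, `mair_set(0, 1, MAIR_NORMAL_WT)` yields 0xBB00, and the framebuffer's page descriptors would then use `pte_set_attrindx(pte, 1)`. How a particular core handles write-through memory is worth checking in its TRM.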

monocasa · on Dec 30, 2022

Not only is copying it over itself not the only solution (all you need to do is issue the right sequence of cache flushes over the affected area, so first CPU cache flush instructions, and then maybe an additional set of MMIO based cache flushes depending on how your L2 and/or L3 work), but copying over itself isn't even enough to be correct. They lucked out that it seems to work, but it's kind of by chance. The cache is well within its rights to still not flush some lines even after this big copy.

ajross · on Dec 30, 2022

No, ARM has cache management hardware (and a memory ordering model which you need to understand on the OOO cores). You just have to use it.

snvzz · on Dec 30, 2022

As cool as this is, Genode is much farther ahead[0] and has a much stronger security model (microkernel multiserver with capabilities).

AIUI they're planning to provide user-installable images by next release (2023-02).

It was gonna be 2022-11, but they chose to delay so that end users only see it polished. It is possible to try it by building it yourself, and it's pretty cool.

0. (grep for Pine) https://genodians.org/

dark_star · on Dec 31, 2022

Practically speaking, "farther ahead" may mean different things to different people. I tried to find a list of boards, systems on a chip (SoCs), or CPUs that Genode supports. There's no master list [1] but you can go digging through the handful of architecture-specific GitHub repos. [2] The number of actually supported SoCs and boards is very limited.

NuttX and Zephyr both support a large number of SoCs and boards [3][4], and they each have a single git repo with a configuration system that lets you build different boards from that single repo. In terms of practical ease of use for hobby and commercial projects, I would say these projects are both far ahead of Genode if your hardware is not supported by Genode and is supported by either NuttX or Zephyr.

[1] https://genode.org/documentation/platforms/index

[2] https://github.com/orgs/genodelabs/repositories

[3] NuttX supported platforms: https://nuttx.apache.org/docs/latest/platforms/index.html

[4] Zephyr supported platforms: https://docs.zephyrproject.org/3.2.0/boards/index.html

snvzz · on Dec 31, 2022

>Practically speaking, "farther ahead" may mean different things to different people.

I completely agree, yet in this case what is meant is pretty clear: PinePhone support.

And Genode is far ahead in PinePhone support.

nanch · on Dec 30, 2022

Lup has such detailed and well-thought-out posts.

I used his guides when developing for the PineTime and everything worked perfectly. I would have been completely lost without his information.

Thank you Lup!

lupyuen · on Dec 31, 2022

Thank you so much! :-)

m463 · on Dec 30, 2022

I haven't heard of NuttX before.

How did it come about? Was it developed from scratch or released from an existing project?

als0 · on Dec 30, 2022

It’s quite old, developed from scratch by a single guy 20 years ago. It never reached mainstream popularity.

dark_star · on Dec 31, 2022

In my opinion NuttX is like a secret weapon. RTOS, good hardware support for many boards and CPUs, and a very high performance TCP/IP stack that can be interrupted by hard realtime tasks.

It's also growing in popularity now, so I would say it hasn't reached mainstream popularity yet. :)

als0 · on Dec 30, 2022

What is the NuttX community like compared to Zephyr? Seems like they have similar goals and I noticed that the latter has ten times the number of open pull requests.

snovv_crash · on Dec 30, 2022

NuttX was the project of a single guy, over 20 years, which only recently got picked up by bigger players (e.g. Xiaomi) when the project was moved to Apache Foundation governance. Before that it was very, very quirky, and while it was possible to do a lot, it was also very effort-intensive. I would choose Zephyr just because it has less technical debt.

dark_star · on Dec 31, 2022

Maybe choose it if your embedded board, system on a chip, or CPU is supported. NuttX has a lot of supported boards, SoCs, and CPUs [1], and they differ from Zephyr's supported hardware [2].

[1] NuttX supported platforms: https://nuttx.apache.org/docs/latest/platforms/index.html

[2] Zephyr supported platforms: https://docs.zephyrproject.org/3.2.0/boards/index.html