J-core Roadmap

J-core processor development started in 2012 with the goal of implementing processors compatible with the existing SuperH instruction set, to be released as the relevant patents expired. The original idea was that the j1/j2/j3/j4 processors would support the corresponding SuperH sh1/sh2/sh3/sh4 instructions.

This roadmap goes through sh4 compatibility, then diverges sharply. All the historical technology we're examining (sh1, sh2, sh3, and sh4) was developed in-house by Hitachi last century, before they spun off SuperH development.

Specifically, sh2a came out over a decade after sh2, won't be out of patent in an interesting timeframe, and doesn't add anything we care about. Similarly, we're ignoring the SMP support added to SuperH in 2007 and instead doing SMP in a clean way aimed at the needs of a modern OS like Linux, an approach which has historical precedent and is therefore clearly patent free.

The sh5 64-bit design is also uninteresting. SuperH's SHcompact ISA was a mature 32 bit RISC design taking advantage of over a decade of experience in that niche, and shmedia had some interesting vector ideas. Overall, though, sh5 was the kind of second system approach that Itanium took to x86: throwing out the existing instruction set, rather than following x86-64's minimal extension mode, which made carefully tailored changes to existing instructions and re-used the 32 bit circuitry. We prefer the x86-64 approach to the Itanium one for our eventual 64 bit version, while supporting a robust and useful co-processor interface to allow for things like vector operations, and even a backwards compatible FPU.

J2: 2015 release, sh2 compatible, no MMU, no FPU, 2-way SMP.

This is a NOMMU chip, implemented as a Harvard architecture (separate instruction and data buses) with a 5 stage pipeline and 16k of cache (8k instruction, 8k data), supported by a memory controller interfacing with up to 256 megabytes of lpddr memory (in one low cost memory chip). It implements all sh2 instructions, plus three instructions sh2 didn't have: two instructions backported from sh3 to support bit shifts in C efficiently, and a CMPXCHG modeled after the original IBM S360 CAS instruction.
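As a rough illustration of why a compare-and-swap primitive matters for SMP, here is a minimal sketch in portable C11 (not J2 assembly) of a lock-free add built only from compare-and-swap. The helper name cas_fetch_add is hypothetical, and nothing here is claimed about how any particular toolchain maps these C11 atomics onto the J2 instruction.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper (not part of any J-core API): add `delta` to a
 * shared counter using only compare-and-swap. Returns the value the
 * counter held immediately before the add took effect. */
static uint32_t cas_fetch_add(_Atomic uint32_t *counter, uint32_t delta)
{
    uint32_t old = atomic_load(counter);

    /* The swap only succeeds if *counter still equals `old`. On failure
     * (another CPU changed it first, or a spurious failure of the weak
     * form) `old` is refreshed with the current value and we retry. */
    while (!atomic_compare_exchange_weak(counter, &old, old + delta))
        ;

    return old;
}
```

The same retry loop is the usual building block for spinlocks and lock-free lists: read the old value, compute the new one, and let the hardware confirm nobody else got there first.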

The last sh2 patent expired in October 2014, and j2 was publicly released in 2015. It was announced at Linuxcon Tokyo in 2015, with a second release at ELC 2016 adding 2-way SMP support.

J3 and J4 do not replace this core; it continues to be of interest in microcontroller applications. This may still be our highest-volume chip 20 years from now, as we enter the era of ubiquitous and disposable computing.

J1: late 2016 release, sh1 compatible (Arduino Country)

Scaling smaller instead of larger, this is a tiny version of j-core aimed at smaller development boards. Although the resulting processor could in theory run Linux, the expected use case is retargeting the Arduino IDE to run bare-metal code out of SRAM.

The idea is to remove most of the memory logic from j2 (DRAM controller, icache, dcache, prefetch unit) and to replace the hardware multiplier with a microcoded shift-and-add implementation of sh1's 16x16->32 muls.w instruction, letting the compiler stitch that together to multiply larger numbers via libgcc.a functions. (Microcoding sh2's multiply would take 33 clock cycles.)
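The shift-and-add idea, and the way a wider multiply can be stitched together from 16x16 pieces, can be sketched in C. This is only a software model under stated assumptions: it is unsigned for simplicity (muls.w itself is signed), the function names mul16_shift_add and mul32_from_16 are hypothetical, and it is not the actual microcode or the actual libgcc routine.

```c
#include <stdint.h>

/* Shift-and-add 16x16->32 unsigned multiply: one conditional add per
 * multiplier bit, so 16 add steps in total. */
static uint32_t mul16_shift_add(uint16_t a, uint16_t b)
{
    uint32_t product = 0;
    uint32_t addend = a;            /* multiplicand, shifted left each step */

    for (int bit = 0; bit < 16; bit++) {
        if (b & 1u)                 /* low multiplier bit set: accumulate */
            product += addend;
        addend <<= 1;
        b >>= 1;
    }
    return product;
}

/* Low 32 bits of a 32x32 multiply, built from three 16x16 products.
 * The high*high term is dropped because it only affects bits 32 and up. */
static uint32_t mul32_from_16(uint32_t x, uint32_t y)
{
    uint16_t xl = (uint16_t)x, xh = (uint16_t)(x >> 16);
    uint16_t yl = (uint16_t)y, yh = (uint16_t)(y >> 16);

    uint32_t low   = mul16_shift_add(xl, yl);
    uint32_t cross = mul16_shift_add(xh, yl) + mul16_shift_add(xl, yh);

    return low + (cross << 16);     /* wraps modulo 2^32, as intended */
}
```

The trade-off is the one the text describes: the narrow multiply needs far fewer steps than a full-width one, and the cost of wider multiplies moves into software.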

This should trim J-core down small enough to fit in the largest Lattice FPGAs (and a correspondingly smaller ASIC form factor), which has the advantage that we should be able to use a fully open source VHDL toolchain for J1 by combining NVC with yosys (although significant work remains to be done there).

J3: 2017 release, sh3 compatible, MMU, 64-bit FPU

The J3 processor is on the roadmap for 2017, and should run SuperH sh3 code (the last patents on sh3 expired in December 2015).

This release adds MMU and FPU support (implemented using the upcoming co-processor interface), as well as three additional userspace instructions (two of which clear and set the S register, the third is cache prefetch). The addition of User/Supervisor mode adds privileged instructions. The MMU will allow a full conventional Linux userspace (fork, more powerful mmap, memory protection, and so on).

The sh3 FPU was 32 bit only (c99 "float"), with 64 bit (c99 "double") added in sh4. We're implementing both at once (as a synthesis option), backporting the sh4 64 bit float to j3 (the relevant IEEE standard is from 1985 and out of patent).

Although sh3 also adds DSP instructions, these break the RISC style pipeline in multiple ways, and we will not implement them (we have a new DSP design in development).

J4: 2018-ish release, sh4 compatible, multi-issue

The 20th anniversary of sh4 is November 2017, and j4 should come out in 2018.

There are a few new sh4 instructions in j4, but most of the work is adding multi-issue, which is the capability to execute more than one instruction per clock cycle. This means adding two instances of a lot of the circuitry and making them coordinate with each other.

J64: 2019-ish, new 64-bit mode

Instead of shmedia's Itanium-like approach, we plan a more x86-64-like approach for j4, with a 32 bit compatibility mode running stock sh4 code (at least in userspace), and a mode bit that switches to 64 bit register size, reinterprets a small subset of the existing instructions, and leaves the rest alone.

What is Disposable Computing?

Imagine a box of cereal on a grocery store shelf with an E-paper display that updates itself every 30 seconds. The whole thing runs from a tiny chip in one corner of the box that can run off a watch battery for 6 months, is deployed a bit like an RFID tag, and costs less than the "free toy inside".

J2 and J1 should do well in that market.

What is Ubiquitous Computing?

Mainframes were replaced by minicomputers, which were replaced by microcomputers, which are being replaced by smartphones. You didn't need to ship your card deck to the company's mainframe when your department had its own minicomputer, you didn't need to sign up for a timeslot on the minicomputer terminal down the hall when you had a machine on your desk, and you don't need the machine on your desk when you have a machine in your pocket.

What replaces the machine in your pocket? A computer that's already there, waiting for you to arrive. The crewmembers of the starship Enterprise don't need a phone; they just say "computer" and the network of nodes embedded in every wall and every appliance responds as needed. This is the fifth generation of the above progression: ubiquitous computing.

J4 should do well in that market.