The SPARC port
##############

The SPARC port targets various machines from Sun's product lineup. The initial effort is on the
Ultra 60 and Ultra 5, with plans to later add the Sun T5120 and its newer CPU. This may change
depending on hardware donations and developer interest.

Support for 32-bit versions of SPARC is currently not planned.

SPARC ABI
=========

The SPARC architecture has 32 integer registers, divided as follows:

- global registers (g0-g7)
- input (i0-i7)
- local (l0-l7)
- output (o0-o7)

Parameter passing and return is done using the output registers, which are
generally considered scratch registers and can be corrupted by the callee. The
caller must take care of preserving them.

The input and local registers are callee-saved, but we have hardware assistance
in the form of a register window. There is an instruction (save, undone by
restore) to shift the registers so that:

- o registers become i registers
- local and output registers are replaced with fresh sets, for use by the
  current function
- global registers are not affected

Note that as a side effect, o6 is moved to i6. This is convenient because these
are the stack and frame pointers, respectively, so the shift sets the frame
pointer for free. Likewise, o7 (where the call instruction stores its own
address) becomes i7, which keeps the return address safe across nested calls.
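
The overlap between windows can be pictured with a small model in C. This is
only a sketch for illustration: the ``regfile`` structure and the helper names
are hypothetical, and the direction in which the window pointer moves is an
assumption that does not matter for the point being made.

.. code-block:: c

   #include <stdint.h>

   #define NWINDOWS 8   /* typical window count, as assumed in this document */

   /* Hypothetical model of the windowed registers: each window owns 8 "in"
    * and 8 "local" registers. Its "out" registers are not stored separately,
    * they are the "in" registers of the next window. */
   struct regfile {
       uint64_t regs[NWINDOWS][16];  /* [window][0-7 = in, 8-15 = local] */
       unsigned cwp;                 /* current window pointer */
   };

   static inline uint64_t *reg_in(struct regfile *rf, unsigned n)
   {
       return &rf->regs[rf->cwp][n];
   }

   static inline uint64_t *reg_local(struct regfile *rf, unsigned n)
   {
       return &rf->regs[rf->cwp][8 + n];
   }

   static inline uint64_t *reg_out(struct regfile *rf, unsigned n)
   {
       return &rf->regs[(rf->cwp + 1) % NWINDOWS][n];
   }

   /* Function entry: the caller's o registers become the callee's i
    * registers, locals and outputs are fresh. */
   static inline void window_save(struct regfile *rf)
   {
       rf->cwp = (rf->cwp + 1) % NWINDOWS;
   }

   /* Function return: go back to the caller's window. */
   static inline void window_restore(struct regfile *rf)
   {
       rf->cwp = (rf->cwp + NWINDOWS - 1) % NWINDOWS;
   }

In this model, a value written to o0 before ``window_save`` is read back from
i0 afterwards without any copy, which is how arguments reach the callee.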

Simple enough functions may end up using just the o registers. In that case
nothing special is necessary, of course.

When shifting the register window, the extra registers come from the register
stack in the CPU. This is not infinite, however: most implementations of SPARC
only have 8 windows available. When the internal stack is full, an overflow
trap is raised, and the handler must free up old windows by storing them on the
stack. Likewise, when the internal stack is empty, an underflow trap must fill
it back from the stack-saved data.
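
To get a feeling for how often these traps happen, here is a simplified
bookkeeping model. The ``cansave`` and ``canrestore`` fields are named after
the SPARC V9 state registers; the structure, the helper names and the policy
of spilling exactly one window per trap are assumptions made for this sketch.

.. code-block:: c

   #define NWINDOWS 8

   struct window_state {
       unsigned cansave;     /* windows that can be taken without a trap */
       unsigned canrestore;  /* windows that can be returned to without a trap */
       unsigned spills;      /* overflow traps taken so far */
       unsigned fills;       /* underflow traps taken so far */
   };

   static inline void window_state_init(struct window_state *ws)
   {
       /* SPARC V9 keeps CANSAVE + CANRESTORE (+ OTHERWIN, ignored here)
        * equal to NWINDOWS - 2. */
       ws->cansave = NWINDOWS - 2;
       ws->canrestore = 0;
       ws->spills = 0;
       ws->fills = 0;
   }

   /* Function entry. */
   static inline void do_save(struct window_state *ws)
   {
       if (ws->cansave == 0) {
           /* Overflow: the handler stores the oldest window on the stack. */
           ws->spills++;
           ws->cansave++;
           ws->canrestore--;
       }
       ws->cansave--;
       ws->canrestore++;
   }

   /* Function return. */
   static inline void do_restore(struct window_state *ws)
   {
       if (ws->canrestore == 0) {
           /* Underflow: the handler reloads one window from the stack. */
           ws->fills++;
           ws->canrestore++;
           ws->cansave--;
       }
       ws->canrestore--;
       ws->cansave++;
   }

With these assumptions, a call chain that goes 10 levels deep from a fresh
state takes 4 overflow traps on the way down and 4 underflow traps on the way
back up.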

Misaligned memory access
========================

The SPARC CPU is not designed to gracefully handle misaligned accesses.
You can access a single byte at any address, but 16-bit accesses are only
allowed at even addresses, 32-bit accesses at addresses that are multiples of
4, etc.
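
The rule can be written down as a one-line check. The helper below is
hypothetical (it is not an existing Haiku function) and assumes the access
size is a power of two.

.. code-block:: c

   #include <stdbool.h>
   #include <stddef.h>
   #include <stdint.h>

   /* Returns true when an access of `size` bytes at `address` is naturally
    * aligned. `size` must be a power of two (1, 2, 4, 8...). */
   static inline bool is_naturally_aligned(uintptr_t address, size_t size)
   {
       return (address & (size - 1)) == 0;
   }

For example, a 16-bit access at 0x1002 passes the check, but a 32-bit access
at the same address does not.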

On x86, for example, such accesses are not a problem: they are allowed and
handled directly by the instructions doing the access, with little or no
performance cost.

On SPARC, however, such accesses cause a SIGBUS. This means a trap handler
has to catch the misaligned access and do it in software, byte by byte, then
give back control to the application. This is, of course, very slow, so we
should avoid it when possible.

Fortunately, gcc knows about this, and will normally do the right thing:

- For usual variables and structures, it will make sure to lay them out so that
  they are aligned. It relies on stack alignment, as well as malloc returning
  sufficiently aligned memory (as required by the C standard).
- For packed structures, gcc knows the data may be misaligned, and will
  automatically use the appropriate way to access it (most likely, byte by
  byte).

This leaves us with two undesirable cases:

- Pointer arithmetic and casting. When computing addresses manually, it's
  possible to generate a misaligned address and cast it to a type with a wider
  alignment requirement. In this case, gcc may access the pointer using a
  multi-byte instruction and cause a SIGBUS. Solution: make sure the struct
  is aligned, or declare it as packed so that unaligned accesses are used
  instead.
- Access to hardware: it is a common pattern to declare a struct as packed,
  and map it to hardware registers. If the alignment isn't known, gcc will use
  byte-by-byte access. It seems volatile would cause gcc to use the proper way
  to access the struct, assuming that a volatile value is necessarily
  aligned as it should.
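
As an illustration of the first case, here are two ways to read a 32-bit value
from an address that may be misaligned. The function names are hypothetical.
The ``memcpy`` variant is not mentioned above; it is a common alternative to
the packed struct approach and is shown here for comparison.

.. code-block:: c

   #include <stdint.h>
   #include <string.h>

   /* memcpy carries no alignment assumption, so the compiler has to generate
    * code that is safe for any address. */
   static inline uint32_t read_unaligned_u32(const void *address)
   {
       uint32_t value;
       memcpy(&value, address, sizeof(value));
       return value;
   }

   /* With a packed struct, gcc assumes an alignment of 1 for the member.
    * may_alias (a gcc extension) keeps the cast valid for any buffer type. */
   struct unaligned_u32 {
       uint32_t value;
   } __attribute__((packed, may_alias));

   static inline uint32_t read_packed_u32(const void *address)
   {
       return ((const struct unaligned_u32 *)address)->value;
   }

A plain ``*(const uint32_t *)address`` would compile just as well, but it lets
gcc use a single 32-bit load, which is exactly what traps on SPARC.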

In the end, we just need to be careful about pointer math resulting in unaligned
accesses. -Wcast-align helps with that, but it also raises a lot of false positives
(where the alignment is preserved even when casting to other types). So we
enable it only as a warning for now. We will need to check the SIGBUS handler to
identify places where we do a lot of misaligned accesses that trigger it, and
rework the code as needed. But in general, except for these cases, we're fine.

The Ultrasparc MMUs
===================

First, a word of warning: the MMU was different in SPARCv8 (32-bit)
implementations, and it was changed again on newer CPUs.

The Ultrasparc-II we are supporting for now is documented in the Ultrasparc
user manual. There were some minor changes in the Ultrasparc-III to accommodate
larger physical addresses. This was then standardized as JPS1, and Fujitsu
also implemented it.

Later on, the design was changed again. For example, the Ultrasparc T2 (UA2005
architecture) uses a different data structure format to enlarge, again, the
physical and virtual address tags.

For now the implementation is focused on Ultrasparc-II because that's what I
have at hand. Later on we will need support for the more recent systems.

Ultrasparc-II MMU
-----------------

There are actually two separate units for the instruction and data address
spaces, known as I-MMU and D-MMU. They each implement a TLB (translation
lookaside buffer) for the recently accessed pages.

This is pretty much all there is to the MMU hardware. No hardware page table
walk is provided. However, there is some support for implementing a TSB
(Translation Storage Buffer), in the form of a way to compute an
address into that buffer where the data for a missing page could be.

It is up to software to manage the TSB (globally or per-process) and in general
keep track of the mappings. This means we are relatively free to manage things
however we want, as long as eventually we can feed the iTLB and dTLB with the
relevant data from the MMU trap handler.
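
The general shape of such a handler can be sketched in C. This is a simplified
model, not the actual trap code (which would be written in assembly): the
tag here is just the virtual page number, contexts are ignored, and
``slow_lookup_fn`` stands for whatever structure the kernel ends up using to
track mappings. All names are hypothetical.

.. code-block:: c

   #include <stddef.h>
   #include <stdint.h>

   #define TSB_ENTRIES    512
   #define PAGE_SHIFT     13                    /* 8K pages */
   #define TSB_DATA_VALID (UINT64_C(1) << 63)

   struct tsb_entry {
       uint64_t tag;    /* simplified: virtual page number */
       uint64_t data;   /* word to load into the TLB */
   };

   /* Fallback used on a TSB miss. Returns 0 when the address is unmapped. */
   typedef uint64_t (*slow_lookup_fn)(uint64_t va);

   static unsigned slow_lookups;  /* counts fallback calls, for illustration */

   /* Toy fallback: identity-map the first 1 MB, nothing else. */
   static uint64_t identity_lookup(uint64_t va)
   {
       slow_lookups++;
       if (va >= 0x100000)
           return 0;
       return TSB_DATA_VALID | (va & ~((UINT64_C(1) << PAGE_SHIFT) - 1));
   }

   /* Returns the data word for the TLB, or 0 for a real page fault. */
   static uint64_t handle_tlb_miss(struct tsb_entry *tsb, uint64_t va,
       slow_lookup_fn slow_lookup)
   {
       uint64_t vpn = va >> PAGE_SHIFT;
       struct tsb_entry *entry = &tsb[vpn % TSB_ENTRIES];
       uint64_t data;

       if ((entry->data & TSB_DATA_VALID) != 0 && entry->tag == vpn)
           return entry->data;      /* TSB hit: fast path */

       data = slow_lookup(va);      /* TSB miss: ask the slow path */
       if (data != 0) {
           entry->tag = vpn;        /* cache the result for next time */
           entry->data = data;
       }
       return data;
   }

The second miss on the same page is served from the TSB without calling the
fallback again, which is the whole point of keeping the buffer around.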

To make sure we can handle the fault without recursing, we need to pin a few
items in place:

In the TLB:

- TLB miss handler code
- TSB and any linked data that the TLB miss handler may need
- asynchronous trap handlers and data

In the TSB:

- TSB-miss handling code
- Interrupt handler code and data

So, from a given virtual address (assuming we are using only 8K pages and a
512-entry TSB to keep things simple):

- VA63-44 are unused and must be a sign extension of bit 43
- VA43-22 are the 'tag' used to match a TSB entry with a virtual address
- VA21-13 are the offset in the TSB at which to find a candidate entry
- VA12-0 are the offset in the 8K page, and used to form PA12-0 for the access
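
The split above translates directly into shifts and masks. The helper names
below are hypothetical; the bit positions are the ones listed above (8K pages,
512-entry TSB).

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   /* VA43-22: tag used to match a TSB entry */
   static inline uint64_t va_tsb_tag(uint64_t va)
   {
       return (va >> 22) & ((UINT64_C(1) << 22) - 1);
   }

   /* VA21-13: index of the candidate entry in a 512-entry TSB */
   static inline unsigned va_tsb_index(uint64_t va)
   {
       return (unsigned)((va >> 13) & 0x1ff);
   }

   /* VA12-0: offset inside the 8K page */
   static inline unsigned va_page_offset(uint64_t va)
   {
       return (unsigned)(va & 0x1fff);
   }

   /* True when VA63-44 is a sign extension of bit 43 */
   static inline bool va_is_canonical(uint64_t va)
   {
       uint64_t high = va >> 43;   /* bits 63-43, 21 bits */
       return high == 0 || high == 0x1fffff;
   }

For the address 0x12345678, this gives a tag of 0x48, a TSB index of 0x1a2 and
a page offset of 0x1678.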

Inside the TLBs, VA63-13 is stored, so there can be multiple entries matching
the same tag active at the same time, even when there is only one in the TSB.
The entries are rotated using a simple LRU scheme, unless they are locked of
course. Be careful not to fill a TLB with only locked entries! Also, one must
take care not to insert a new mapping for a given VA without first removing
any possible previous one (no need to worry about this when handling a TLB
miss however, as in that case we obviously know that there was no previous
entry).

Entries also have a "context". This could for example be mapped to the process
ID, making it easy to clear all entries related to a specific context.

TSB entries format
------------------

Each entry is composed of two 64-bit values: "Tag" and "Data". The data uses the
same format as the TLB entries, however the tag is different.

They are as follows:

Tag
***

- Bit 63: 'G' indicating a global entry, the context should be ignored.
- Bits 60-48: context ID (13 bits)
- Bits 41-0: VA63-22 as the 'tag' to identify this entry

Data
****

- Bit 63: 'V' indicating a valid entry, if it's 0 the entry is unused.
- Bits 62-61: size: 8K, 64K, 512K, 4MB
- Bit 60: NFO, indicating No Fault Only
- Bit 59: Invert Endianness of accesses to this page
- Bits 58-50: reserved for use by software
- Bits 49-41: reserved for diagnostics
- Bits 40-13: Physical Address<40-13>
- Bits 12-7: reserved for use by software
- Bit 6: Lock in TLB
- Bit 5: Cachable physical
- Bit 4: Cachable virtual
- Bit 3: Access has side effects (HW is mapped here, or DMA shared RAM)
- Bit 2: Privileged
- Bit 1: Writable
- Bit 0: Global
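
The two layouts above can be turned into a few constants and helpers. The
macro and function names are made up for this sketch and are not taken from
the Haiku sources; only the bit positions come from the lists above.

.. code-block:: c

   #include <stdint.h>

   /* Tag word */
   #define TSB_TAG_GLOBAL        (UINT64_C(1) << 63)
   #define TSB_TAG_CONTEXT_SHIFT 48
   #define TSB_TAG_CONTEXT_MASK  0x1fffu          /* 13 bits */

   /* Data word */
   #define TTE_VALID          (UINT64_C(1) << 63)
   #define TTE_SIZE_8K        (UINT64_C(0) << 61)
   #define TTE_SIZE_64K       (UINT64_C(1) << 61)
   #define TTE_SIZE_512K      (UINT64_C(2) << 61)
   #define TTE_SIZE_4M        (UINT64_C(3) << 61)
   #define TTE_LOCK           (UINT64_C(1) << 6)
   #define TTE_CACHE_PHYSICAL (UINT64_C(1) << 5)
   #define TTE_CACHE_VIRTUAL  (UINT64_C(1) << 4)
   #define TTE_SIDE_EFFECT    (UINT64_C(1) << 3)
   #define TTE_PRIVILEGED     (UINT64_C(1) << 2)
   #define TTE_WRITABLE       (UINT64_C(1) << 1)
   #define TTE_GLOBAL         (UINT64_C(1) << 0)

   /* Bits 40-13 of the physical address */
   #define TTE_PA_MASK \
       (((UINT64_C(1) << 41) - 1) & ~((UINT64_C(1) << 13) - 1))

   /* Build a non-global tag: context in bits 60-48, VA63-22 in bits 41-0. */
   static inline uint64_t tsb_make_tag(unsigned context, uint64_t va)
   {
       return ((uint64_t)(context & TSB_TAG_CONTEXT_MASK)
           << TSB_TAG_CONTEXT_SHIFT) | (va >> 22);
   }

   /* Build a valid data word for the given physical address and flags. */
   static inline uint64_t tsb_make_data(uint64_t pa, uint64_t flags)
   {
       return TTE_VALID | (pa & TTE_PA_MASK) | flags;
   }

A cacheable, writable kernel page at physical address 0x40002000 would for
example use ``tsb_make_data(0x40002000, TTE_CACHE_PHYSICAL | TTE_PRIVILEGED |
TTE_WRITABLE)``. Note that the tag helper expects the unused upper bits of the
virtual address to be zero.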

TLB internal tag
****************

- Bits 63-13: VA<63-13>
- Bits 12-0: context ID

Conveniently, a 512-entry TSB fits exactly in an 8K page, so it can be locked
in the TLB with a single entry there. However, it may be a wise idea to instead
map 64K (or more) of RAM locked as a single entry for all the things that need
to be accessed by the TLB miss trap handler, so we minimize the use of TLB
entries.
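
The size arithmetic (512 entries of two 64-bit words each) can be checked at
compile time. The type and macro names are illustrative only.

.. code-block:: c

   #include <stdint.h>

   /* One TSB entry: a 64-bit tag followed by a 64-bit data word. */
   struct tsb_pair {
       uint64_t tag;
       uint64_t data;
   };

   #define TSB_ENTRY_COUNT 512
   #define PAGE_SIZE_8K    8192

   /* 512 * 16 bytes = 8192 bytes, exactly one 8K page. */
   _Static_assert(sizeof(struct tsb_pair) * TSB_ENTRY_COUNT == PAGE_SIZE_8K,
       "a 512-entry TSB must fit exactly in one 8K page");

Keeping such a check next to the TSB definition would catch any accidental
change to the entry layout or to the number of entries.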

Likewise, it may be useful to use 64K pages instead of 8K whenever possible.
The hardware provides some support for mixing the two sizes but it makes things
a bit more complex. Let's start out with simpler things.

Software floating-point support
===============================

The SPARC instruction set specifies instructions for handling long double
values. However, no hardware implementation actually provides them. They
generate a trap, which is expected to be handled by the softfloat library.

Since traps are slow, and gcc knows better, it will never generate those
instructions. Instead it directly calls into the C library, to functions
specified in the ABI and used to do long double math using softfloats.
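
As an example, take an ordinary function doing long double math. The function
itself is hypothetical. The helper names in the comment (``_Qp_mul``,
``_Qp_add``) are the ones defined by the 64-bit SPARC ABI, given here as an
indication of what to look for in the disassembly.

.. code-block:: c

   /* On sparc64, gcc is expected to compile the multiplication and the
    * addition below into calls to ABI helpers such as _Qp_mul and _Qp_add,
    * instead of emitting quad-precision floating-point instructions. */
   long double scale_and_offset(long double value, long double factor,
       long double offset)
   {
       return value * factor + offset;
   }

This is why the support code must be available everywhere long double is
used, including in the kernel.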

The support code for this is, in our case, compiled into both the kernel and
libroot. It lives in src/system/libroot/os/arch/sparc/softfloat.c (and other
support files). This code was extracted from FreeBSD, rather than glibc,
because that made it much easier to get it building in the kernel.

Openboot bootloader
===================

Openboot is Sun's implementation of Open Firmware, so we should be able to share
a lot of code with the PowerPC port. There are some differences, however.

Executable format
-----------------

PowerPC uses COFF. SPARC uses a.out, which is a lot simpler. According to the
spec, some fields should be zeroed out, but it says implementations may choose
to allow other values, so a standard a.out file works as well.

It used to be possible to generate one with objcopy, but support was removed,
so we now use elf2aout (imported from FreeBSD).

The file is first loaded at 4000, then relocated to its load address (we use
202000) and executed there.
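
To show how simple the format is, here is a sketch of a header parser. It
assumes the classic 32-byte big-endian SunOS a.out header (flags and machine
type bytes, a 16-bit magic, then seven 32-bit fields); this layout and all the
names below are assumptions and should be checked against the Open Firmware
binding before relying on them.

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   #define AOUT_HEADER_SIZE 32
   #define AOUT_OMAGIC      0407   /* text and data contiguous, writable */
   #define AOUT_NMAGIC      0410   /* read-only text */
   #define AOUT_ZMAGIC      0413   /* demand-paged */

   struct aout_header {
       uint16_t magic;
       uint32_t text_size;
       uint32_t data_size;
       uint32_t bss_size;
       uint32_t symbols_size;
       uint32_t entry;
   };

   static inline uint32_t read_be32(const uint8_t *p)
   {
       return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
           | ((uint32_t)p[2] << 8) | (uint32_t)p[3];
   }

   /* Decode the header and return true when the magic number is known. */
   static inline bool aout_parse(const uint8_t raw[AOUT_HEADER_SIZE],
       struct aout_header *out)
   {
       out->magic = (uint16_t)((raw[2] << 8) | raw[3]);
       out->text_size = read_be32(raw + 4);
       out->data_size = read_be32(raw + 8);
       out->bss_size = read_be32(raw + 12);
       out->symbols_size = read_be32(raw + 16);
       out->entry = read_be32(raw + 20);
       /* The two relocation size fields at offsets 24 and 28 are ignored. */

       return out->magic == AOUT_OMAGIC || out->magic == AOUT_NMAGIC
           || out->magic == AOUT_ZMAGIC;
   }

Reading the fields byte by byte keeps the parser independent of the host
endianness, so the same code can be used in a build tool running on x86.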

Openfirmware prompt
-------------------

To get the prompt on display, use STOP+A at boot until you get the "ok" prompt.
On some machines, if no keyboard is detected, the ROM will assume it is set up
in headless mode, and will expect a BREAK+A on the serial port.

STOP+N resets all variables to default values (in case you messed up input or
output, for example).

Useful commands
---------------

Disable autoboot to get to the openboot prompt and stop there:

.. code-block:: text

   setenv auto-boot? false

Configuring for keyboard/framebuffer I/O:

.. code-block:: text

   setenv screen-#columns 160
   setenv screen-#rows 49
   setenv output-device screen:r1920x1080x60
   setenv input-device keyboard

Configuring openboot for serial port:

.. code-block:: text

   setenv ttya-mode 38400,8,n,1,-
   setenv output-device ttya
   setenv input-device ttya
   reset

Boot from network
-----------------

static ip
*********

This currently works best, because rarp does not let the called binary know the
IP address. We need the IP address if we want to mount the root filesystem using
the remote_disk server.

.. code-block:: text

    boot net:192.168.1.2,somefile,192.168.1.89

The first IP is the server from which to download (using TFTP), the second is
the client IP to use. Once the bootloader starts, it will detect that it is
booted from network and look for the remote_disk_server on the same machine.

rarp
****

This needs a reverse ARP server (easy to set up on any Linux system). You need
to list the MAC address of the SPARC machine in /etc/ethers on the server. The
machine will get its IP, and will use TFTP to the server which replied, to get
the boot file from there.

.. code-block:: text

    boot net:,somefile

(net is an alias to the network card and also sets the load address: /pci@1f,4000/network@1,1)

dhcp
****

This needs a DHCP/BOOTP server configured to send the info about where to find
the file to load and boot.

.. code-block:: text

    boot net:dhcp

Debugging
---------

.. code-block:: text

   202000 dis (disassemble starting at 202000 until next return instruction)
   4000 1000 dump (dump 1000 bytes from address 4000)
   .registers (show global registers)
   .locals (show local/windowed registers)
   %pc dis (disassemble code being executed)
   ctrace (backtrace)
336