The SPARC port
##############

The SPARC port targets various machines from the Sun product lineup. The initial effort is on the
Ultra 60 and Ultra 5, with plans to later add the Sun T5120 and its newer CPU. This may change
depending on hardware donations and developer interest.

Support for 32-bit versions of SPARC is currently not planned.

SPARC ABI
=========

The SPARC architecture has 32 integer registers, divided as follows:

- global registers (g0-g7)
- input (i0-i7)
- local (l0-l7)
- output (o0-o7)

Parameter passing and return is done using the output registers, which are
generally considered scratch registers and can be corrupted by the callee. The
caller must take care of preserving them.

The input and local registers are callee-saved, but we have hardware assistance
in the form of a register window. There is an instruction to shift the registers
so that:

- o registers become i registers
- local and output registers are replaced with fresh sets, for use by the
  current function
- global registers are not affected

Note that as a side effect, o6 is moved to i6. This is convenient because these
are usually the stack and frame pointers, respectively, so shifting the window
sets the frame pointer for free.

Simple enough functions may end up using just the o registers; in that case
nothing special is necessary.

When shifting the register window, the extra registers come from the register
stack in the CPU. This is not infinite, however: most implementations of SPARC
only have 8 windows available. When the internal stack is full, an overflow
trap is raised, and the handler must free up old windows by storing them on the
stack. Likewise, when the internal stack is empty, an underflow trap must fill
it back from the stack-saved data.

Misaligned memory access
========================

The SPARC CPU is not designed to gracefully handle misaligned accesses.
You can access a single byte at any address, but 16-bit accesses are allowed only
at even addresses, 32-bit accesses only at addresses that are multiples of 4, etc.
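
As a minimal sketch of the alignment rule above (this is not code from the
source tree; ``is_aligned`` and ``read_be32`` are hypothetical helper names),
the following C fragment checks whether an address satisfies a power-of-two
alignment, and reads a 32-bit big-endian value one byte at a time, which is
legal at any address:

.. code-block:: c

   #include <stdbool.h>
   #include <stddef.h>
   #include <stdint.h>

   /* Hypothetical helper: true if addr is a multiple of alignment.
    * The alignment must be a power of two (2, 4, 8, ...). */
   static bool
   is_aligned(const void* addr, size_t alignment)
   {
       return ((uintptr_t)addr & (alignment - 1)) == 0;
   }

   /* Hypothetical helper: read a 32-bit big-endian value byte by byte.
    * Only single-byte loads are used, so no alignment is required. */
   static uint32_t
   read_be32(const uint8_t* p)
   {
       return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
           | ((uint32_t)p[2] << 8) | (uint32_t)p[3];
   }

A direct ``*(const uint32_t*)p`` load is only safe when ``is_aligned(p, 4)``
holds; otherwise a byte-wise read such as ``read_be32`` avoids the misaligned
access entirely.
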

On x86, for example, such accesses are not a problem: they are allowed and
handled directly by the instructions doing the access, so no trap is involved
and there is little to no performance cost.

On SPARC, however, such accesses will cause a SIGBUS. This means a trap handler
has to catch the misaligned access and do it in software, byte by byte, then
give back control to the application. This is, of course, very slow, so we
should avoid it when possible.

Fortunately, gcc knows about this, and will normally do the right thing:

- For usual variables and structures, it will make sure to lay them out so that
  they are aligned. It relies on stack alignment, as well as malloc returning
  sufficiently aligned memory (as required by the C standard).
- On packed structures, gcc knows the data is misaligned, and will automatically
  use the appropriate way to access it (most likely, byte by byte).

This leaves us with two undesirable cases:

- Pointer arithmetic and casting. When computing addresses manually, it's
  possible to generate a misaligned address and cast it to a type with a wider
  alignment requirement. In this case, gcc may access the pointer using a
  multi-byte instruction and cause a SIGBUS. Solution: make sure the struct
  is aligned, or declare it as packed so unaligned accesses are used instead.
- Access to hardware: it is a common pattern to declare a struct as packed,
  and map it to hardware registers. If the alignment isn't known, gcc will use
  byte-by-byte access. It seems volatile would cause gcc to use the proper way
  to access the struct, assuming that a volatile value is necessarily
  aligned as it should be.

In the end, we just need to be careful about pointer math resulting in unaligned
accesses. -Wcast-align helps with that, but it also raises a lot of false positives
(where the alignment is preserved even when casting to other types), so we
enable it only as a warning for now. We will need to check the SIGBUS handler to
identify places where we do a lot of misaligned accesses that trigger it, and
rework the code as needed. But in general, except for these cases, we're fine.

The Ultrasparc MMUs
===================

First, a word of warning: the MMU was different in SPARCv8 (32-bit)
implementations, and it was changed again on newer CPUs.

The Ultrasparc-II we are supporting for now is documented in the Ultrasparc
user manual. There were some minor changes in the Ultrasparc-III to accommodate
larger physical addresses. This was then standardized as JPS1, and Fujitsu
also implemented it.

Later on, the design was changed again. For example, the Ultrasparc T2 (UA2005
architecture) uses a different data structure format to enlarge, again, the
physical and virtual address tags.

For now the implementation is focused on Ultrasparc-II because that's what I
have at hand. Later on we will need support for the more recent systems.

Ultrasparc-II MMU
-----------------

There are actually two separate units for the instruction and data address
spaces, known as I-MMU and D-MMU. They each implement a TLB (translation
lookaside buffer) for the recently accessed pages.

This is pretty much all there is to the MMU hardware. No hardware page table
walk is provided. However, there is some support for implementing a TSB
(Translation Storage Buffer): the hardware provides a way to compute the
address in that buffer where the data for a missing page could be.

It is up to software to manage the TSB (globally or per-process) and in general
keep track of the mappings. This means we are relatively free to manage things
however we want, as long as eventually we can feed the iTLB and dTLB with the
relevant data from the MMU trap handler.
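
To illustrate what "computing an address into the TSB" amounts to, here is a
rough software equivalent in C. It is only a sketch under the simplifying
assumptions used elsewhere in this document (8K pages, a 512-entry TSB, and
entries made of two 64-bit values, so 16 bytes each). The names ``tsb_index``
and ``tsb_entry_address`` are hypothetical, and the real hardware exposes the
result through its own registers rather than through such a function.

.. code-block:: c

   #include <stdint.h>

   #define PAGE_SHIFT      13   /* 8K pages (assumption stated above) */
   #define TSB_ENTRIES     512  /* assumed TSB size, must be a power of two */
   #define TSB_ENTRY_SIZE  16   /* one 64-bit tag plus one 64-bit data word */

   /* Hypothetical helper: index of the candidate TSB entry for a virtual
    * address. With the sizes above this selects VA bits 21-13. */
   static uint64_t
   tsb_index(uint64_t va)
   {
       return (va >> PAGE_SHIFT) & (TSB_ENTRIES - 1);
   }

   /* Hypothetical helper: address of the candidate entry, given the base
    * address of the TSB. */
   static uint64_t
   tsb_entry_address(uint64_t tsbBase, uint64_t va)
   {
       return tsbBase + tsb_index(va) * TSB_ENTRY_SIZE;
   }

Two virtual addresses that differ only above bit 21 land on the same entry,
which is why each entry also needs a tag to tell them apart.
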

To make sure we can handle the fault without recursing, we need to pin a few
items in place:

In the TLB:

- TLB miss handler code
- TSB and any linked data that the TLB miss handler may need
- asynchronous trap handlers and data

In the TSB:

- TSB-miss handling code
- Interrupt handlers code and data

So, from a given virtual address (assuming we are using only 8K pages and a
512-entry TSB to keep things simple):

- VA63-44 are unused and must be a sign extension of bit 43
- VA43-22 are the 'tag' used to match a TSB entry with a virtual address
- VA21-13 are the offset in the TSB at which to find a candidate entry
- VA12-0 are the offset in the 8K page, and used to form PA12-0 for the access

Inside the TLBs, VA63-13 is stored, so there can be multiple entries matching
the same tag active at the same time, even when there is only one in the TSB.
The entries are rotated using a simple LRU scheme, unless they are locked of
course. Be careful to not fill a TLB with only locked entries! Also, one must
take care not to insert a new mapping for a given VA without first removing
any possible previous one (no need to worry about this when handling a TLB
miss, however, as in that case we obviously know that there was no previous
entry).

Entries also have a "context". This could for example be mapped to the process
ID, making it easy to clear all entries related to a specific context.

TSB entries format
------------------

Each entry is composed of two 64-bit values: "Tag" and "Data". The data uses the
same format as the TLB entries, however the tag is different.

They are as follows:

Tag
***

- Bit 63: 'G' indicating a global entry, the context should be ignored.
- Bits 60-48: context ID (13 bits)
- Bits 41-0: VA63-22 as the 'tag' to identify this entry

Data
****

- Bit 63: 'V' indicating a valid entry, if it's 0 the entry is unused.
- Bits 62-61: size: 8K, 64K, 512K, 4MB
- Bit 60: NFO, indicating No Fault Only
- Bit 59: Invert Endianness of accesses to this page
- Bits 58-50: reserved for use by software
- Bits 49-41: reserved for diagnostics
- Bits 40-13: Physical Address<40-13>
- Bits 12-7: reserved for use by software
- Bit 6: Lock in TLB
- Bit 5: Cachable physical
- Bit 4: Cachable virtual
- Bit 3: Access has side effects (HW is mapped here, or DMA shared RAM)
- Bit 2: Privileged
- Bit 1: Writable
- Bit 0: Global

TLB internal tag
****************

- Bits 63-13: VA<63-13>
- Bits 12-0: context ID

Conveniently, a 512-entry TSB fits exactly in an 8K page, so it can be locked
in the TLB with a single entry there. However, it may be a wise idea to instead
map 64K (or more) of RAM locked as a single entry for all the things that need
to be accessed by the TLB miss trap handler, so we minimize the use of TLB
entries.

Likewise, it may be useful to use 64K pages instead of 8K whenever possible.
The hardware provides some support for mixing the two sizes but it makes things
a bit more complex. Let's start out with simpler things.
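
The bit layouts above can be sketched in C as follows. This is an illustrative
encoding only, not the code used by the port: all macro and function names are
hypothetical, only 8K pages are handled, the size field is assumed to encode
8K as 0 (following the order in which the sizes are listed), and reserved bits
of the tag are assumed to be zero.

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   /* Tag word fields (hypothetical names, positions from the Tag list). */
   #define TSB_TAG_GLOBAL     (1ULL << 63)
   #define TSB_TAG_CTX_SHIFT  48
   #define TSB_TAG_CTX_MASK   0x1fffULL            /* 13-bit context ID */
   #define TSB_TAG_VA_SHIFT   22
   #define TSB_TAG_VA_MASK    ((1ULL << 42) - 1)   /* VA63-22 in bits 41-0 */

   /* Data word fields (hypothetical names, positions from the Data list). */
   #define TTE_VALID          (1ULL << 63)
   #define TTE_SIZE_8K        (0ULL << 61)         /* assumed encoding */
   #define TTE_PA_MASK        (((1ULL << 41) - 1) & ~((1ULL << 13) - 1))
   #define TTE_LOCK           (1ULL << 6)
   #define TTE_CACHE_PHYS     (1ULL << 5)
   #define TTE_CACHE_VIRT     (1ULL << 4)
   #define TTE_SIDE_EFFECT    (1ULL << 3)
   #define TTE_PRIVILEGED     (1ULL << 2)
   #define TTE_WRITABLE       (1ULL << 1)
   #define TTE_GLOBAL         (1ULL << 0)

   /* Build the tag word for a non-global mapping. */
   static uint64_t
   tsb_make_tag(uint64_t va, uint32_t context)
   {
       return ((context & TSB_TAG_CTX_MASK) << TSB_TAG_CTX_SHIFT)
           | ((va >> TSB_TAG_VA_SHIFT) & TSB_TAG_VA_MASK);
   }

   /* Build the data word for a valid 8K mapping with the given flag bits. */
   static uint64_t
   tsb_make_data(uint64_t pa, uint64_t flags)
   {
       return TTE_VALID | TTE_SIZE_8K | (pa & TTE_PA_MASK) | flags;
   }

   /* Check whether a TSB entry translates the given address and context.
    * A global tag matches any context. */
   static bool
   tsb_entry_matches(uint64_t tag, uint64_t data, uint64_t va,
       uint32_t context)
   {
       if ((data & TTE_VALID) == 0)
           return false;

       uint64_t wanted = tsb_make_tag(va, context);
       if ((tag & TSB_TAG_GLOBAL) != 0)
           return (tag & TSB_TAG_VA_MASK) == (wanted & TSB_TAG_VA_MASK);
       return tag == wanted;
   }

A TLB miss handler built along these lines would fetch the candidate entry,
call something like ``tsb_entry_matches``, and on success load the data word
into the TLB; on failure it would fall back to whatever slower lookup the
kernel uses to track mappings.
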

Software floating-point support
===============================

The SPARC instruction set specifies instructions for handling long double
values; however, no hardware implementation actually provides them. They
generate a trap, which is expected to be handled by the softfloat library.

Since traps are slow, and gcc knows better, it will never generate those
instructions. Instead it directly calls into the C library, to functions
specified in the ABI and used to do long double math using softfloats.

The support code for this is, in our case, compiled into both the kernel and
libroot. It lives in src/system/libroot/os/arch/sparc/softfloat.c (and other
support files). This code was extracted from FreeBSD, rather than glibc,
because that made it much easier to get it building in the kernel.

Openboot bootloader
===================

Openboot is Sun's implementation of Open Firmware, so we should be able to share
a lot of code with the PowerPC port. There are some differences, however.

Executable format
-----------------

PowerPC uses COFF. Sparc uses a.out, which is a lot simpler. According to the
spec, some fields should be zeroed out, but it says implementations may choose
to allow other values, so a standard a.out file works as well.

It used to be possible to generate one with objcopy, but support was removed,
so we now use elf2aout (imported from FreeBSD).

The file is first loaded at 4000, then relocated to its load address (we use
202000) and executed there.

Openfirmware prompt
-------------------

To get the prompt on display, use STOP+A at boot until you get the "ok" prompt.
On some machines, if no keyboard is detected, the ROM will assume it is set up
in headless mode, and will expect a BREAK+A on the serial port.

STOP+N resets all variables to default values (in case you messed up input or
output, for example).

Useful commands
---------------

Disable autoboot to get to the openboot prompt and stop there:

.. code-block:: text

   setenv auto-boot? false

Configuring for keyboard/framebuffer I/O:

.. code-block:: text

   setenv screen-#columns 160
   setenv screen-#rows 49
   setenv output-device screen:r1920x1080x60
   setenv input-device keyboard

Configuring openboot for serial port:

.. code-block:: text

   setenv ttya-mode 38400,8,n,1,-
   setenv output-device ttya
   setenv input-device ttya
   reset

Boot from network
-----------------

static ip
*********

This currently works best, because rarp does not let the called binary know the
IP address. We need the IP address if we want to mount the root filesystem using
remote_disk server.

.. code-block:: text

   boot net:192.168.1.2,somefile,192.168.1.89

The first IP is the server from which to download (using TFTP), the second is
the client IP to use. Once the bootloader starts, it will detect that it is
booted from network and look for the remote_disk_server on the same machine.

rarp
****

This needs a reverse ARP server (easy to set up on any Linux system). You need
to list the MAC address of the SPARC machine in /etc/ethers on the server. The
machine will get its IP, and will use TFTP to the server which replied, to get
the boot file from there.

.. code-block:: text

   boot net:,somefile

(net is an alias to the network card and also sets the load address: /pci@1f,4000/network@1,1)

dhcp
****

This needs a DHCP/BOOTP server configured to send the info about where to find
the file to load and boot.

.. code-block:: text

   boot net:dhcp

Debugging
---------

.. code-block:: text

   202000 dis       (disassemble starting at 202000 until next return instruction)
   4000 1000 dump   (dump 1000 bytes from address 4000)
   .registers       (show global registers)
   .locals          (show local/windowed registers)
   %pc dis          (disassemble code being executed)
   ctrace           (backtrace)