ISASPEC - XML Based ISA Specification
=====================================

isaspec provides a mechanism to describe an instruction set in XML, and
generate a disassembler and assembler. The intention is to describe the
instruction set more formally than hand-coded assembler and disassembler,
and better decouple the shader compiler from the underlying instruction
encoding to simplify dealing with instruction encoding differences between
generations of GPU.

Benefits of a formal ISA description, compared to hand-coded assemblers
and disassemblers, include easier detection of new bit combinations that
were not seen in previous generations, due to a more rigorous description
of bits that are expected to be '0' or '1' or 'x' (dontcare), and
verification that different encodings don't have conflicting bits
(i.e. that the specification cannot result in more than one valid
interpretation of any bit pattern).

The isaspec tool and XML schema are intended to be generic (not specific
to ir3), although there are currently a couple of limitations due to
shortcuts taken to get things up and running (which are mostly not inherent
to the XML schema, and should not be too difficult to remove from the py and
decode/disasm utility):

* Maximum "field" size is 64b
* Fixed instruction size

Often, especially when new functionality is added in later gens while
retaining (or at least mostly retaining) backwards compatibility with
encodings used in earlier generations, the actual encoding can be rather
messy to describe. To support this, isaspec provides several flexible
mechanisms, such as conditional overrides and derived fields. This not
only allows for describing an irregular instruction encoding, but also
allows matching an existing disasm syntax (which might not have been
designed around the idea of disassembly based on a formal ISA description).

Bitsets
-------

The fundamental concept for matching a bit pattern to an instruction
decoding/encoding is a hierarchical tree of bitsets. This is intended to
match how the HW decodes instructions, where certain bits describe the
instruction (and sub-encoding, and so on), and other bits describe various
operands to the instruction.

Bitsets can also be used recursively as the type of a field described
in another bitset.

The leaves of the tree of instruction bitsets represent every possible
instruction. Deciding which instruction a bit pattern is amounts to:

.. code-block:: c

   m = (val & bitsets[n]->mask) & ~bitsets[n]->dontcare;

   if (m == bitsets[n]->match) {
      /* we've found the instruction description */
   }

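As a minimal, self-contained sketch of that lookup, the test can be run over
a table of leaf bitsets. The ``struct bitset_desc`` type and the
``find_bitset()`` helper are hypothetical names used only for illustration;
they are not the data structures of the generated decoder, and 64b
instructions are assumed:

.. code-block:: c

   #include <stddef.h>
   #include <stdint.h>

   /* Hypothetical leaf description; members mirror the snippet above. */
   struct bitset_desc {
      const char *name;
      uint64_t match;    /* required values of the bits selected by mask */
      uint64_t mask;     /* bits written as '0' or '1' in a pattern */
      uint64_t dontcare; /* bits written as 'x' in a pattern */
   };

   /* Return the first description that matches val, or NULL if none does. */
   static const struct bitset_desc *
   find_bitset(const struct bitset_desc *bitsets, size_t count, uint64_t val)
   {
      for (size_t n = 0; n < count; n++) {
         uint64_t m = (val & bitsets[n].mask) & ~bitsets[n].dontcare;

         if (m == bitsets[n].match)
            return &bitsets[n];
      }

      return NULL;
   }

Because a valid specification cannot give two interpretations of the same bit
pattern, returning the first hit is sufficient in this sketch.
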
For example, the starting point to decode an ir3 instruction is a 64b
bitset:

.. code-block:: xml

   <bitset name="#instruction" size="64">
      <doc>
         Encoding of an ir3 instruction. All instructions are 64b.
      </doc>
   </bitset>

At the first level of the instruction encoding hierarchy, the high three
bits group things into instruction "categories":

.. code-block:: xml

   <bitset name="#instruction-cat2" extends="#instruction">
      <field name="DST" low="32" high="39" type="#reg-gpr"/>
      <field name="REPEAT" low="40" high="41" type="#rptN"/>
      <field name="SAT" pos="42" type="bool" display="(sat)"/>
      <field name="SS" pos="44" type="bool" display="(ss)"/>
      <field name="UL" pos="45" type="bool" display="(ul)"/>
      <field name="DST_CONV" pos="46" type="bool">
         <doc>
            Destination register is opposite precision as source, i.e.
            if {FULL} is true then destination is half precision, and
            vice versa.
         </doc>
      </field>
      <derived name="DST_HALF" expr="#dest-half" type="bool" display="h"/>
      <field name="EI" pos="47" type="bool" display="(ei)"/>
      <field name="FULL" pos="52" type="bool">
         <doc>Full precision source registers</doc>
      </field>
      <field name="JP" pos="59" type="bool" display="(jp)"/>
      <field name="SY" pos="60" type="bool" display="(sy)"/>
      <pattern low="61" high="63">010</pattern>  <!-- cat2 -->
      <!--
         NOTE, both SRC1_R and SRC2_R are defined at this level because
         SRC2_R is still a valid bit for (nopN) (REPEAT==0) for cat2
         instructions with only a single src
       -->
      <field name="SRC1_R" pos="43" type="bool" display="(r)"/>
      <field name="SRC2_R" pos="51" type="bool" display="(r)"/>
      <derived name="ZERO" expr="#zero" type="bool" display=""/>
   </bitset>

The ``<pattern>`` elements are the part(s) that determine which leaf-node
bitset matches against a given bit pattern. The leaf node's match/mask/
dontcare bitmasks are a combination of those defined at the leaf node and,
recursively, at each parent bitset.

For example, cat2 instructions (ALU instructions with up to two src
registers) can have either one or two source registers:

.. code-block:: xml

   <bitset name="#instruction-cat2-1src" extends="#instruction-cat2">
      <override expr="#cat2-cat3-nop-encoding">
         <display>
            {SY}{SS}{JP}{SAT}(nop{NOP}) {UL}{NAME} {EI}{DST_HALF}{DST}, {SRC1}
         </display>
         <derived name="NOP" expr="#cat2-cat3-nop-value" type="uint"/>
         <field name="SRC1" low="0" high="15" type="#multisrc">
            <param name="ZERO" as="SRC_R"/>
            <param name="FULL"/>
         </field>
      </override>
      <display>
         {SY}{SS}{JP}{SAT}{REPEAT}{UL}{NAME} {EI}{DST_HALF}{DST}, {SRC1}
      </display>
      <pattern low="16" high="31">xxxxxxxxxxxxxxxx</pattern>
      <pattern low="48" high="50">xxx</pattern>  <!-- COND -->
      <field name="SRC1" low="0" high="15" type="#multisrc">
         <param name="SRC1_R" as="SRC_R"/>
         <param name="FULL"/>
      </field>
   </bitset>

   <bitset name="absneg.f" extends="#instruction-cat2-1src">
      <pattern low="53" high="58">000110</pattern>
   </bitset>

In this example, ``absneg.f`` is a concrete cat2 instruction (leaf node of
the bitset inheritance tree) which has a single src register. At the
``#instruction-cat2-1src`` level, bits that are used for the 2nd src arg
and condition code (for cat2 instructions which use a condition code) are
defined as 'x' (dontcare), which matches our understanding of the hardware.
It also lets the disassembler flag cases where '1' bits show up in places
we don't expect, which may signal a new instruction (sub)encoding.

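To make the combination of patterns across the hierarchy concrete, the
following sketch folds ``<pattern>`` strings into match/mask/dontcare
bitmasks. The ``struct pattern_bits`` type and ``add_pattern()`` helper are
hypothetical, and the split of bits between ``mask`` and ``dontcare`` is an
assumption chosen to agree with the matching test shown earlier; the real
generator may organize this differently:

.. code-block:: c

   #include <stddef.h>
   #include <stdint.h>
   #include <string.h>

   /* Running totals for one leaf, accumulated from the leaf and its parents. */
   struct pattern_bits {
      uint64_t match, mask, dontcare;
   };

   /* Fold one pattern covering bits low..high into b. The string is read
    * most-significant bit first. Returns 0 on success, or -1 if the string
    * is malformed or does not fit the bit range.
    */
   static int
   add_pattern(struct pattern_bits *b, unsigned low, unsigned high,
               const char *pattern)
   {
      size_t len = strlen(pattern);

      if (high < low || high > 63 || len != (size_t)(high - low) + 1)
         return -1;

      for (size_t i = 0; i < len; i++) {
         uint64_t bit = UINT64_C(1) << (high - i);

         switch (pattern[i]) {
         case '1':
            b->match |= bit;
            /* fallthrough */
         case '0':
            b->mask |= bit;
            break;
         case 'x':
            b->dontcare |= bit;
            break;
         default:
            return -1;
         }
      }

      return 0;
   }

For ``absneg.f``, calling this helper once for each pattern at the
``#instruction-cat2``, ``#instruction-cat2-1src`` and leaf levels accumulates
the bitmasks for the whole inheritance chain.
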
You'll notice that ``SRC1`` refers back to a different bitset hierarchy
that describes the various src register encodings (used for cat2 and
cat4 instructions), i.e. GPR vs CONST vs relative GPR/CONST. For fields
which have bitset types, parameters can be "passed" in via ``<param>``
elements, which can be referred to by the display template string and/or
expressions. For example, this helps to deal with cases where other fields
outside of that bitset control the encoding/decoding, such as in the
``#multisrc`` example:

.. code-block:: xml

   <bitset name="#multisrc" size="16">
      <doc>
         Encoding for instruction source which can be GPR/CONST/IMMED
         or relative GPR/CONST.
      </doc>
   </bitset>

   ...

   <bitset name="#multisrc-gpr" extends="#multisrc">
      <display>
         {ABSNEG}{SRC_R}{HALF}{SRC}
      </display>
      <derived name="HALF" expr="#multisrc-half" type="bool" display="h"/>
      <field name="SRC" low="0" high="7" type="#reg-gpr"/>
      <pattern low="8" high="13">000000</pattern>
      <field name="ABSNEG" low="14" high="15" type="#absneg"/>
   </bitset>

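Reading a ``<field>`` out of an encoded value amounts to extracting an
inclusive bit range. A minimal sketch, where ``extract_field()`` is a
hypothetical helper rather than part of the generated decoder:

.. code-block:: c

   #include <stdint.h>

   /* Extract bits low..high (inclusive) of val, as a field with those
    * low/high attributes would. A field using pos="N" is the one-bit case
    * low == high == N.
    */
   static uint64_t
   extract_field(uint64_t val, unsigned low, unsigned high)
   {
      unsigned width = high - low + 1;
      uint64_t mask = (width >= 64) ? ~UINT64_C(0)
                                    : (UINT64_C(1) << width) - 1;

      return (val >> low) & mask;
   }

With the ``#multisrc-gpr`` layout above, ``extract_field(src, 0, 7)`` would
yield the raw ``SRC`` bits and ``extract_field(src, 14, 15)`` the raw
``ABSNEG`` bits, each of which is then decoded according to its own bitset
type.
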
At some level in the bitset inheritance hierarchy, there is expected to be a
``<display>`` element specifying a template string used during bitset
decoding. The display template consists of references to fields (which may
be derived fields) specified as ``{FIELDNAME}`` and other characters
which are just echoed through to the resulting decoded bitset.

The special field reference ``{NAME}`` prints the name of the bitset. This is
often useful when the ``<display>`` element is at a higher level than the
leaves of the hierarchy, for example a whole class of similar instructions that
only differ in opcode.

Sometimes there may be multiple variants of an instruction that must be
different bitsets, for example because they are so different that they must
derive from different bitsets, but they have the same name. Because bitset
names must be unique in the encoder, this can be a problem, but it can be
worked around with the ``displayname`` attribute on the ``bitset``, which
changes how ``{NAME}`` is displayed but not the name used in the encoder.
``displayname`` is only useful for leaf bitsets.

It is possible to define a line column alignment value per field to influence
the visual output. It needs to be specified as ``{FIELDNAME:align=xx}``.

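The basic ``{FIELDNAME}`` substitution described above can be sketched as
follows. This is an illustration only: ``struct field_value`` and
``expand_display()`` are hypothetical names, each field is assumed to have
been formatted to text already (an empty string when nothing is shown), and
the ``align`` option is not handled:

.. code-block:: c

   #include <stddef.h>
   #include <string.h>

   struct field_value {
      const char *name;
      const char *text; /* already-formatted value, "" if nothing is shown */
   };

   /* Expand {FIELDNAME} references from tmpl into out, writing at most
    * outsz bytes including the terminating NUL. Unknown fields expand to
    * nothing; all other characters are echoed through.
    */
   static void
   expand_display(const char *tmpl, const struct field_value *fields,
                  size_t nfields, char *out, size_t outsz)
   {
      size_t len = 0;

      if (outsz == 0)
         return;

      while (*tmpl && len + 1 < outsz) {
         const char *end = (*tmpl == '{') ? strchr(tmpl, '}') : NULL;

         if (!end) {
            out[len++] = *tmpl++; /* literal character */
            continue;
         }

         size_t namelen = (size_t)(end - tmpl) - 1;
         for (size_t i = 0; i < nfields; i++) {
            if (strlen(fields[i].name) == namelen &&
                memcmp(fields[i].name, tmpl + 1, namelen) == 0) {
               size_t n = strlen(fields[i].text);

               if (n > outsz - len - 1)
                  n = outsz - len - 1; /* truncate to the space left */
               memcpy(out + len, fields[i].text, n);
               len += n;
               break;
            }
         }
         tmpl = end + 1;
      }

      out[len] = '\0';
   }

Passing the bitset name as a field called ``NAME`` is one simple way to model
the special ``{NAME}`` reference in this sketch.
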
The ``<override>`` element will be described in the next section, but it
provides for both different decoded instruction syntax/mnemonics (when
simply providing a different display template string) as well as instruction
encoding where different ranges of bits have a different meaning based on
some other bitfield (or combination of bitfields). In this example it is
used to cover the cases where ``SRCn_R`` has a different meaning and a
different disassembly syntax depending on whether ``REPEAT`` equals zero.

The ``<template>`` element can be used to represent a placeholder for a more
complex ``<display>`` substring.

Overrides
---------

In many cases, a bitset is not convenient for describing the expected
disasm syntax, and/or interpretation of some range of bits differs based
on some other field or combination of fields. These *could* be modeled
as different derived bitsets, at the expense of a combinatorial explosion
of the size of the bitset inheritance tree. For example, *every* cat2
(and cat3) instruction has a ``(nopN)`` interpretation in addition to
the ``(rptN)`` interpretation.

An ``<override>`` in a bitset allows redefining the display string and/or
field definitions from the default case. If the override's expr(ession)
evaluates to non-zero, its ``<display>``, ``<field>``, and ``<derived>``
elements take precedence over what is defined in the top-level of the
bitset (i.e. the default case).

Expressions
-----------

Both ``<override>`` and ``<derived>`` fields make use of ``<expr>`` elements,
either defined inline, or defined and named at the top level and referred to
by name in multiple other places. An expression is a simple 'C' expression
which can reference fields (including other derived fields) with the same
``{FIELDNAME}`` syntax as display template strings. For example:

.. code-block:: xml

   <expr name="#cat2-cat3-nop-encoding">
      (({SRC1_R} != 0) || ({SRC2_R} != 0)) &amp;&amp; ({REPEAT} == 0)
   </expr>

In the case of ``<override>`` elements, the override applies if the expression
evaluates to non-zero. In the case of ``<derived>`` fields, the expression
evaluates to the value of the derived field.

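As a rough illustration of how an expression drives an override, the sketch
below writes ``#cat2-cat3-nop-encoding`` as a C function over the raw 64b
value, using the ``SRC1_R``, ``SRC2_R`` and ``REPEAT`` bit positions from the
cat2 bitset, and then picks a display string. The ``struct override_case``
and ``struct bitset_display`` types and the ``pick_display()`` helper are
hypothetical; the generated code is not claimed to look like this:

.. code-block:: c

   #include <stddef.h>
   #include <stdint.h>

   /* One override: applies when expr() returns non-zero. */
   struct override_case {
      int (*expr)(uint64_t val);
      const char *display;
   };

   struct bitset_display {
      const struct override_case *overrides;
      size_t num_overrides;
      const char *display; /* default case */
   };

   /* (({SRC1_R} != 0) || ({SRC2_R} != 0)) && ({REPEAT} == 0) */
   static int
   nop_encoding(uint64_t val)
   {
      unsigned src1_r = (val >> 43) & 0x1;
      unsigned src2_r = (val >> 51) & 0x1;
      unsigned repeat = (val >> 40) & 0x3;

      return (src1_r != 0 || src2_r != 0) && repeat == 0;
   }

   /* The first override whose expression is non-zero wins, otherwise the
    * default display string is used.
    */
   static const char *
   pick_display(const struct bitset_display *b, uint64_t val)
   {
      for (size_t i = 0; i < b->num_overrides; i++) {
         if (b->overrides[i].expr(val))
            return b->overrides[i].display;
      }

      return b->display;
   }

The same idea extends to ``<field>`` and ``<derived>`` definitions inside an
override, which would replace the default definitions whenever the
expression is non-zero.
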
Branching
---------

isaspec supports a few special field types for printing branch destinations. If
``isaspec_decode_options::branch_labels`` is true, a pre-pass over the program
to be disassembled determines which instructions are branch destinations. A
label is then printed before each destination when disassembling, and the name
of the destination is printed when printing the branch field itself.

There are two different types, which affect how the destination is computed. If
the field type is ``branch``, then the field is interpreted as a signed offset
from the current instruction. If the type is ``absbranch``, then it is
interpreted as an offset from the first instruction to be disassembled. In
either case, the offset is multiplied by the instruction size.

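The two destination computations can be sketched as below. The helper names
are hypothetical, the result is expressed as a byte offset from the start of
the program, and treating the ``absbranch`` field as unsigned is an
assumption not stated above:

.. code-block:: c

   #include <stdint.h>

   /* Sign-extend the low `bits` bits of val (1 <= bits <= 64). */
   static int64_t
   sign_extend(uint64_t val, unsigned bits)
   {
      uint64_t sign = UINT64_C(1) << (bits - 1);

      val &= (sign << 1) - 1;
      return (int64_t)((val ^ sign) - sign);
   }

   /* Byte offset of the destination from the start of the program.
    * cur_instr is the index of the instruction holding the field, and
    * instr_bytes is the (fixed) instruction size.
    */
   static int64_t
   branch_target(uint64_t field, unsigned field_bits, int is_abs,
                 int64_t cur_instr, unsigned instr_bytes)
   {
      int64_t index = is_abs ? (int64_t)field
                             : cur_instr + sign_extend(field, field_bits);

      return index * (int64_t)instr_bytes;
   }

For a 26-bit ``branch`` field holding all ones in instruction 10 of a program
with 8-byte instructions, the offset is -1, so the destination is instruction
9 at byte offset 72.
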
For example, here is what a signed-offset unconditional jump instruction might
look like:

.. code-block:: xml

   <bitset name="jump" extends="#instruction">
      <display>
         jump #{OFFSET}
      </display>
      <pattern low="26" high="31">110010</pattern> <!-- opcode goes here -->
      <field name="OFFSET" low="0" high="25" type="branch"/>
   </bitset>

This would produce a disassembly like ``jump #l42`` if the destination is 42
instructions after the start of the disassembly. The destination would be
preceded by a line with just ``l42:``.

``branch`` and ``absbranch`` fields can additionally have a ``call="true"``
attribute. For now, this just changes the disassembly. In particular, the label
prefix is changed to ``fxn`` and an extra empty line is added before the
destination to visually separate the disassembly into functions. So, for
example, a call instruction defined like this:

.. code-block:: xml

   <bitset name="call" extends="#instruction">
      <display>
         call #{OFFSET}
      </display>
      <pattern low="26" high="31">110010</pattern> <!-- opcode goes here -->
      <field name="OFFSET" low="0" high="25" type="branch" call="true"/>
   </bitset>

will disassemble to ``call #fxn42``.

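The automatically generated label names follow directly from the destination
index and the ``call`` attribute. A small sketch, with ``format_label()`` as
a hypothetical helper:

.. code-block:: c

   #include <stdio.h>

   /* Write "l<N>" for a plain branch destination or "fxn<N>" for a call
    * destination, where N is the destination's instruction index counted
    * from the start of the disassembly. Returns what snprintf() returns.
    */
   static int
   format_label(char *buf, size_t bufsz, unsigned dest_instr, int is_call)
   {
      return snprintf(buf, bufsz, "%s%u", is_call ? "fxn" : "l", dest_instr);
   }
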
Finally, users with special knowledge about where execution may start can define
"entrypoints" when disassembling which are printed like function call
destinations, with an extra empty line, but with an arbitrary user-defined
name. Names that are ``fxn`` or ``l`` followed by a number are discouraged
because they may clash with automatically-generated names.

Encoding
--------

To facilitate instruction encoding, ``<encode>`` elements can be provided
to teach the generated instruction packing code how to map from data structures
representing the IR to fields. For example:

.. code-block:: xml

   <bitset name="#instruction" size="64">
      <doc>
         Encoding of an ir3 instruction. All instructions are 64b.
      </doc>
      <gen min="300"/>
      <encode type="struct ir3_instruction *" case-prefix="OPC_">
         <!--
            Define mapping from encode src to individual fields,
            which are common across all instruction categories
            at the root instruction level

            Not all of these apply to all instructions, but we
            can define mappings here for anything that is used
            in more than one instruction category. For things
            that are specific to a single instruction category,
            mappings should be defined at that level instead.
          -->
         <map name="DST">src->regs[0]</map>
         <map name="SRC1">src->regs[1]</map>
         <map name="SRC2">src->regs[2]</map>
         <map name="SRC3">src->regs[3]</map>
         <map name="REPEAT">src->repeat</map>
         <map name="SS">!!(src->flags &amp; IR3_INSTR_SS)</map>
         <map name="JP">!!(src->flags &amp; IR3_INSTR_JP)</map>
         <map name="SY">!!(src->flags &amp; IR3_INSTR_SY)</map>
         <map name="UL">!!(src->flags &amp; IR3_INSTR_UL)</map>
         <map name="EQ">0</map>  <!-- We don't use this (yet) -->
         <map name="SAT">!!(src->flags &amp; IR3_INSTR_SAT)</map>
      </encode>
   </bitset>

The ``type`` attribute specifies that the input to encoding an instruction
is a ``struct ir3_instruction *``. In the case of bitset hierarchies with
multiple possible leaf nodes, a ``case-prefix`` attribute should be supplied
along with a function that maps the bitset encode source to an enum value
with the specified prefix prepended to the uppercased leaf node name. For
example, in this case, "add.f" becomes ``OPC_ADD_F``.

Individual ``<map>`` elements teach the encoder how to map from the encode
source to fields in the encoded instruction.
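
Two pieces of the encode path lend themselves to a short sketch: placing a
mapped value into a field's bit range, and deriving the enum name from a leaf
bitset name. Both helpers below are hypothetical. Replacing every
non-alphanumeric character with ``_`` is an assumption generalized from the
"add.f" example above:

.. code-block:: c

   #include <ctype.h>
   #include <stddef.h>
   #include <stdint.h>

   /* Place val into bits low..high (inclusive) of an encoded instruction,
    * leaving all other bits unchanged.
    */
   static uint64_t
   pack_field(uint64_t instr, unsigned low, unsigned high, uint64_t val)
   {
      unsigned width = high - low + 1;
      uint64_t mask = (width >= 64) ? ~UINT64_C(0)
                                    : (UINT64_C(1) << width) - 1;

      return (instr & ~(mask << low)) | ((val & mask) << low);
   }

   /* Build the enum name for a leaf bitset, e.g. prefix "OPC_" and leaf
    * "add.f" give "OPC_ADD_F". Output is truncated to fit outsz bytes.
    */
   static void
   case_name(const char *prefix, const char *leaf, char *out, size_t outsz)
   {
      size_t n = 0;

      if (outsz == 0)
         return;

      for (const char *p = prefix; *p && n + 1 < outsz; p++)
         out[n++] = *p;

      for (const char *p = leaf; *p && n + 1 < outsz; p++) {
         unsigned char c = (unsigned char)*p;

         out[n++] = isalnum(c) ? (char)toupper(c) : '_';
      }

      out[n] = '\0';
   }

With the cat2 layout shown earlier, a mapping such as ``REPEAT`` to
``src->repeat`` would correspond to something like
``pack_field(instr, 40, 41, src->repeat)`` in this sketch.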